Title Of Invention

Cloned Genomes Of Infectious 
Hepatitis C Viruses And Uses Thereof 

This application claims the benefit of U.S.
Provisional Application No. 60/053,062 filed July 18,
1997.

Field Of Invention 

The present invention relates to molecular 
approaches to the production of nucleic acid sequences 
which comprise the genome of infectious hepatitis C 
viruses. In particular, the invention provides nucleic 
acid sequences which comprise the genomes of infectious 
hepatitis C viruses of genotype 1a and 1b strains. The
invention therefore relates to the use of these sequences, 
and polypeptides encoded by all or part of these 
sequences, in the development of vaccines and diagnostic 
assays for HCV and in the development of screening assays 
for the identification of antiviral agents for HCV. 

Background Of Invention 

Hepatitis C virus (HCV) has a positive-sense
single-strand RNA genome and is a member of the virus
family Flaviviridae (Choo et al., 1991; Rice, 1996). As
for all positive-stranded RNA viruses, the genome of HCV
functions as mRNA from which all viral proteins necessary
for propagation are translated.

The viral genome of HCV is approximately 9600
nucleotides (nts) and consists of a highly conserved 5'
untranslated region (UTR), a single long open reading
frame (ORF) of approximately 9,000 nts and a complex 3'
UTR. The 5' UTR contains an internal ribosomal entry site
(Tsukiyama-Kohara et al., 1992; Honda et al., 1996). The
3' UTR consists of a short variable region, a
polypyrimidine tract of variable length and, at the 3'
end, a highly conserved region of approximately 100 nts
(Kolykhalov et al., 1996; Tanaka et al., 1995; Tanaka et
al., 1996; Yamada et al., 1996). The last 46 nucleotides
of this conserved region were predicted to form a stable
stem-loop structure thought to be critical for viral
replication (Blight and Rice, 1997; Ito and Lai, 1997;
Tsuchihara et al., 1997). The ORF encodes a large
polypeptide precursor that is cleaved into at least 10
proteins by host and viral proteinases (Rice, 1996). The
predicted envelope proteins contain several conserved
N-linked glycosylation sites and cysteine residues (Okamoto
et al., 1992a). The NS3 gene encodes a serine protease
and an RNA helicase and the NS5B gene encodes an
RNA-dependent RNA polymerase.

Globally, six major HCV genotypes (genotypes 1-6)
and multiple subtypes (a, b, c, etc.) have been
identified (Bukh et al., 1993; Simmonds et al., 1993).
The most divergent HCV isolates differ from each other by
more than 30% over the entire genome (Okamoto et al.,
1992a) and HCV circulates in an infected individual as a
quasispecies of closely related genomes (Bukh et al.,
1995; Farci et al., 1997).

At present, more than 80% of individuals 
infected with HCV become chronically infected and these 
chronically infected individuals have a relatively high 
risk of developing chronic hepatitis, liver cirrhosis and
hepatocellular carcinoma (Hoofnagle, 1997). In the U.S.,
HCV genotypes 1a and 1b constitute the majority of
infections while in many other areas, especially in Europe
and Japan, genotype 1b predominates.

The only effective therapy for chronic hepatitis
C, interferon (IFN), induces a sustained response in less
than 25% of treated patients (Fried and Hoofnagle, 1995).
Consequently, HCV is currently the most common cause of
end stage liver failure and the reason for about 30% of
liver transplants performed in the U.S. (Hoofnagle, 1997).
In addition, a number of recent studies suggested that the
severity of liver disease and the outcome of therapy may
be genotype-dependent (reviewed in Bukh et al., 1997). In
particular, these studies suggested that infection with
HCV genotype 1b was associated with more severe liver
disease (Brechot, 1997) and a poorer response to IFN
therapy (Fried and Hoofnagle, 1995). As a result of the
inability to develop a universally effective therapy
against HCV infection, it is estimated that there are
still more than 25,000 new infections yearly in the U.S.
(Alter, 1997). Moreover, since there is no vaccine for HCV,
HCV remains a serious public health problem.

However, despite the intense interest in the
development of vaccines and therapies for HCV, progress
has been hindered by the absence of a useful cell culture
system and the lack of any small animal model for
laboratory study. For example, while replication of HCV
in several cell lines has been reported, such observations
have turned out not to be highly reproducible. In
addition, the chimpanzee is the only animal model, other
than man, for this disease. Consequently, HCV has been
able to be studied only by using clinical materials
obtained from patients or experimentally infected
chimpanzees (an animal model whose availability is very
limited).

However, several researchers have recently
reported the construction of infectious cDNA clones of
HCV, the identification of which would permit a more
effective search for susceptible cell lines and facilitate
molecular analysis of the viral genes and their function.
For example, Dash et al., (1997) and Yoo et al., (1995)
reported that RNA transcripts from cDNA clones of HCV-1
(genotype 1a) and HCV-N (genotype 1b), respectively,
resulted in viral replication after transfection into
human hepatoma cell lines. Unfortunately, the viability
of these clones was not tested in vivo and concerns were
raised about the infectivity of these cDNA clones in vitro
(Fausto, 1997). In addition, neither clone contained
the terminal 98 conserved nucleotides at the very 3' end
of the UTR.

Kolykhalov et al., (1997) and Yanagi et al.
(1997) reported the derivation from HCV strain H77 (which
is genotype 1a) of cDNA clones of HCV that are infectious
for chimpanzees. However, while these infectious clones
will aid in studying HCV replication and pathogenesis and
will provide an important tool for development of in vitro
replication and propagation systems, it is important to
have infectious clones of more than one genotype given the
extensive genetic heterogeneity of HCV and the potential
impact of such heterogeneity on the development of
effective therapies and vaccines for HCV.

Summary Of The Invention

The present invention relates to nucleic acid
sequences which comprise the genome of infectious
hepatitis C viruses and in particular, nucleic acid
sequences which comprise the genome of infectious
hepatitis C viruses of genotype 1a and 1b strains. It is
therefore an object of the invention to provide nucleic
acid sequences which encode infectious hepatitis C
viruses. Such nucleic acid sequences are referred to
throughout the application as "infectious nucleic acid
sequences".

For the purposes of this application, nucleic
acid sequence refers to RNA, DNA, cDNA or any variant
thereof capable of directing host organism synthesis of a
hepatitis C virus polypeptide. It is understood that
nucleic acid sequence encompasses nucleic acid sequences,
which due to degeneracy, encode the same polypeptide
sequence as the nucleic acid sequences described herein.
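
By way of illustration only, the degeneracy referred to above can
be sketched in a few lines of Python. The two short sequences and
the partial codon table below are hypothetical teaching examples
and are not taken from the sequences of the invention; the codon
assignments follow the standard genetic code.

```python
# Partial standard codon table: only the codons used in this sketch.
CODON_TABLE = {
    "ATG": "M",
    "AGC": "S", "TCT": "S",   # two synonymous serine codons
    "ACC": "T", "ACG": "T",   # two synonymous threonine codons
    "AAT": "N", "AAC": "N",   # two synonymous asparagine codons
    "CCT": "P", "CCA": "P",   # two synonymous proline codons
}


def translate(dna):
    """Translate a DNA coding sequence, codon by codon, into one-letter amino acids."""
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


# Hypothetical sequences that differ at four of five codons.
seq_a = "ATGAGCACCAATCCT"
seq_b = "ATGTCTACGAACCCA"

peptide_a = translate(seq_a)
peptide_b = translate(seq_b)
print(peptide_a, peptide_b, peptide_a == peptide_b)
```

Although the two nucleotide sequences differ, both translate to
the same polypeptide, which is the sense in which degenerate
sequences are encompassed.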

The invention also relates to the use of the
infectious nucleic acid sequences to produce chimeric
genomes consisting of portions of the open reading frames
of infectious nucleic acid sequences of other genotypes
(including, but not limited to, genotypes 1, 2, 3, 4, 5
and 6) and subtypes (including, but not limited to,
subtypes 1a, 1b, 2a, 2b, 2c, 3a, 4a-4f, 5a and 6a) of HCV.
For example, infectious nucleic acid sequences of the 1a and
1b strains H77 and HC-J4, respectively, described herein
can be used to produce chimeras with sequences from the
genomes of other strains of HCV from different genotypes
or subtypes. Nucleic acid sequences which comprise
sequence from the open-reading frames of 2 or more HCV
genotypes or subtypes are designated "chimeric nucleic
acid sequences".

The invention further relates to mutations of
the infectious nucleic acid sequences of the invention
where mutation includes, but is not limited to, point
mutations, deletions and insertions. In one embodiment, a
gene or fragment thereof can be deleted to determine the
effect of the deleted gene or genes on the properties of
the encoded virus such as its virulence and its ability to
replicate. In an alternative embodiment, a mutation may
be introduced into the infectious nucleic acid sequences
to examine the effect of the mutation on the properties of
the virus in the host cell.

The invention also relates to the introduction
of mutations or deletions into the infectious nucleic acid
sequences in order to produce an attenuated hepatitis C
virus suitable for vaccine development.

The invention further relates to the use of the
infectious nucleic acid sequences to produce attenuated
viruses via passage in vitro or in vivo of the viruses
produced by transfection of a host cell with the
infectious nucleic acid sequence.

The present invention also relates to the use of
the nucleic acid sequences of the invention or fragments
thereof in the production of polypeptides where "nucleic
acid sequences of the invention" refers to infectious
nucleic acid sequences, mutations of infectious nucleic
acid sequences, chimeric nucleic acid sequences and
sequences which comprise the genome of attenuated viruses
produced from the infectious nucleic acid sequences of the
invention. The polypeptides of the invention, especially
structural polypeptides, can serve as immunogens in the
development of vaccines or as antigens in the development
of diagnostic assays for detecting the presence of HCV in
biological samples.

The invention therefore also relates to vaccines 
for use in immunizing mammals especially humans against 
hepatitis C. In one embodiment, the vaccine comprises one 
or more polypeptides made from a nucleic acid sequence of 
the invention or fragment thereof. In a second 
embodiment, the vaccine comprises a hepatitis C virus 
produced by transfection of host cells with the nucleic 
acid sequences of the invention. 

The present invention therefore relates to
methods for preventing hepatitis C in a mammal. In one
embodiment the method comprises administering to a mammal
a polypeptide or polypeptides encoded by a nucleic acid
sequence of the invention in an amount effective to induce
protective immunity to hepatitis C. In another
embodiment, the method of prevention comprises
administering to a mammal a hepatitis C virus of the
invention in an amount effective to induce protective
immunity against hepatitis C.

In yet another embodiment, the method of
protection comprises administering to a mammal a nucleic
acid sequence of the invention or a fragment thereof in an
amount effective to induce protective immunity against
hepatitis C.

The invention also relates to hepatitis C 
viruses produced by host cells transfected with the 
nucleic acid sequences of the present invention.

The invention therefore also provides 
pharmaceutical compositions comprising the nucleic acid 
sequences of the invention and/or their encoded hepatitis
C viruses. The invention further provides pharmaceutical
compositions comprising polypeptides encoded by the 
nucleic acid sequences of the invention or fragments 
thereof. The pharmaceutical compositions of the invention 
may be used prophylactically or therapeutically. 

The invention also relates to antibodies to the
hepatitis C viruses of the invention or their encoded
polypeptides and to pharmaceutical compositions comprising
these antibodies.

The present invention further relates to
polypeptides encoded by the nucleic acid sequences of the
invention or fragments thereof. In one embodiment, said
polypeptide or polypeptides are fully or partially
purified from hepatitis C virus produced by cells
transfected with a nucleic acid sequence of the invention.
In another embodiment, the polypeptide or polypeptides are
produced recombinantly from a fragment of the nucleic acid
sequences of the invention. In yet another embodiment,
the polypeptides are chemically synthesized.

The invention also relates to the use of the
nucleic acid sequences of the invention to identify cell
lines capable of supporting the replication of HCV in
vitro.

The invention further relates to the use of the 
nucleic acid sequences of the invention or their encoded 
proteases (e.g. NS3 protease) to develop screening assays
to identify antiviral agents for HCV. 

Brief Description Of Figures 

Figure 1 shows a strategy for constructing
full-length cDNA clones of HCV strain H77. The long PCR
products amplified with H1 and H9417R primers were cloned
directly into pGEM-9zf(-) after digestion with Not I and
Xba I (pH21I and pH50I). Next, the 3' UTR was cloned into
both pH21I and pH50I after digestion with Afl II and Xba I
(pH21 and pH50). pH21 was tested for infectivity in a
chimpanzee. To improve the efficiency of cloning, we
constructed a cassette vector with consensus 5' and 3'
termini of H77. This cassette vector (pCV) was obtained
by cutting out the BamHI fragment (nts 1358-7530 of the
H77 genome) from pH50, followed by religation. Finally,
the long PCR products of H77 amplified with primers H1 and
H9417R (H product) or primers A1 and H9417R (A product)
were cloned into pCV after digestion with Age I and Afl II
or with Pin AI and Bfr I. The latter procedure yielded
multiple complete cDNA clones of strain H77 of HCV.

Figure 2 shows the results of gel
electrophoresis of long RT-PCR amplicons of the entire ORF
of H77 and the transcription mixture of the infectious
clone of H77. The complete ORF was amplified by long
RT-PCR with the primers H1 or A1 and H9417R from 10^ GE of
H77. A total of 10 µg of the consensus chimeric clone
(pCV-H77C) linearized with Xba I was transcribed in a 100
µl reaction with T7 RNA polymerase. Five µl of the
transcription mixture was analyzed by gel electrophoresis
and the remainder of the mixture was injected into a
chimpanzee. Lane 1, molecular weight marker; lane 2,
products amplified with primers H1 and H9417R; lane 3,
products amplified with primers A1 and H9417R; lane 4,
transcription mixture containing the RNA transcripts and
linearized clone pCV-H77C (12.5 kb).

Figure 3 is a diagram of the genome organization
of HCV strain H77 and the genetic heterogeneity of
individual full-length clones compared with the consensus
sequence of H77. Solid lines represent aa changes.
Dashed lines represent silent mutations. A * in pH21
represents a point mutation at nt 58 in the 5' UTR. In
the ORF, the consensus chimeric clone pCV-H77C had 11 nt
differences [at positions 1625 (C->T), 2709 (T->C), 3380
(A->G), 3710 (C->T), 3914 (G->A), 4463 (T->C), 5058 (C->T),
5834 (C->T), 6734 (T->C), 7154 (C->T), and 7202 (T->C)] and
one aa change (F->L at aa 790) compared with the
consensus sequence of H77. This clone was infectious.
Clones pH21 and pCV-H11 had 19 nts (7 aa) and 64 nts (21
aa) differences, respectively, compared with the consensus
sequence of H77. These two clones were not infectious. A
single point mutation in the 3' UTR at nucleotide 9406
(G->A) introduced to create an Afl II cleavage site is not
shown.

Figures 4A-4F show the complete nucleotide
sequence of a H77C clone produced according to the present
invention and Figures 4G-4H show the amino acid sequence
encoded by the H77C clone.

Figure 5 shows an agarose gel of long RT-PCR
amplicons and transcription mixtures. Lanes 1 and 4:
Molecular weight marker (Lambda/HindIII digest). Lanes 2
and 3: RT-PCR amplicons of the entire ORF of HC-J4. Lane
5: pCV-H77C transcription control (Yanagi et al., 1997).
Lanes 6, 7, and 8: 1/40 of each transcription mixture of
pCV-J4L2S, pCV-J4L4S and pCV-J4L6S, respectively, which
was injected into the chimpanzee.

Figure 6 shows the strategy utilized for the
construction of full-length cDNA clones of HCV strain
HC-J4. The long PCR products were cloned as two separate
fragments (L and S) into a cassette vector (pCV) with
fixed 5' and 3' termini of HCV (Yanagi et al., 1997).
Full-length cDNA clones of HC-J4 were obtained by
inserting the L fragment from three pCV-J4L clones into
three identical pCV-J4S9 clones after digestion with
PinAI (isoschizomer of AgeI) and BfrI (isoschizomer of
AflII).

Figure 7 shows amino acid positions with a
quasispecies of HC-J4 in the acute phase plasma pool
obtained from an experimentally infected chimpanzee.
Cons-p9: consensus amino acid sequence deduced from
analysis of nine L fragments and nine S fragments (see
Fig. 6). Cons-D: consensus sequence derived from direct
sequencing of the PCR product. A, B, C: groups of similar
viral species. Dot: amino acid identical to that in
Cons-p9. Capital letter: amino acid different from that in
Cons-p9. Cons-F: composite consensus amino acid sequence
combining Cons-p9 and Cons-D. Boxed amino acid: different
from that in Cons-F. Shaded amino acid: different from
that in all species A sequences. An *: defective ORF due
to a nucleotide deletion (clone L1, aa 1097) or insertion
(clone L7, aa 2770). Diagonal lines: fragments used to
construct the infectious clone.

Figure 8 shows comparisons (percent difference)
of nucleotide (nts 156-8935) and predicted amino acid
sequences (aa 1-2864) of L clones (species A, B, and C,
this study), HC-J4/91 (Okamoto et al., 1992b) and HC-J4/83
(Okamoto et al., 1992b). Differences among species A
sequences and among species B sequences are shaded.
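
As a minimal sketch of what a pairwise "percent difference"
comparison involves, the following Python function counts
mismatched positions between two sequences. It assumes the
sequences are already aligned and of equal length and it does not
handle gaps; the example sequences are hypothetical, and the
specification does not state that the values in Figure 8 were
computed in exactly this way.

```python
def percent_difference(seq_a, seq_b):
    """Return the percentage of aligned positions at which two sequences differ.

    Assumes both sequences are pre-aligned and of equal, non-zero length.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    mismatches = sum(1 for a, b in zip(seq_a, seq_b) if a != b)
    return 100.0 * mismatches / len(seq_a)


# Hypothetical 10-nt sequences differing at two positions.
clone_x = "ACGTACGTAC"
clone_y = "ACGTTCGTAA"
print(percent_difference(clone_x, clone_y))
```

The same function applies to predicted amino acid sequences, since
it only compares characters position by position.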

Figure 9 shows UPGMA ("unweighted pair group
method with arithmetic mean") trees of HC-J4/91 (Okamoto
et al., 1992b), HC-J4/83 (Okamoto et al., 1992b), two
prototype strains of genotype 1b (HCV-J, Kato et al.,
1990; HCV-BK, Takamizawa et al., 1991), and L clones (this
study).

Figure 10 shows the alignment of the HVR1 and
HVR2 amino acid sequences of the E2 sequences of nine L
clones of HC-J4 (species A, B, and C) obtained from an
early acute phase plasma pool of an experimentally
infected chimpanzee compared with the sequences of eight
clones (HC-J4/91-20 through HC-J4/91-27, Okamoto et al.,
1992b) derived from the inoculum. Dot: an amino acid
identical to that in the top line. Capital letters: amino
acid different from that in the top line.

Figure 11 shows the alignment of the 5' UTR and
the 3' UTR sequences of infectious clones of genotype 1a
(pCV-H77C) and 1b (pCV-J4L6S). Top line: consensus
sequence of the indicated strain. Dot: identity with
consensus sequence. Capital letter: different from the
consensus sequence. Dash: deletion. Underlined: PinAI
and BfrI cleavage site. Numbering corresponds to the HCV
sequence of pCV-J4L6S.

Figure 12 shows a comparison of individual
full-length cDNA clones of the ORF of HCV strain HC-J4 with
the consensus sequence (see Fig. 7). Solid lines: amino
acid changes. Dashed lines: silent mutations. Clone
pCV-J4L6S was infectious in vivo whereas clones pCV-J4L2S and
pCV-J4L4S were not.

Figure 13 shows biochemical (ALT levels) and PCR
analyses of a chimpanzee following percutaneous
intrahepatic transfection with RNA transcripts of the
infectious clone of pCV-J4L2S, pCV-J4L4S and pCV-J4L6S.
The ALT serum enzyme levels were measured in units per
liter (U/l). For the PCR analysis, "HCV RNA" represented
by an open rectangle indicates a serum sample that was
negative for HCV after nested PCR; "HCV RNA" represented
by a closed rectangle indicates that the serum sample was
positive for HCV; and HCV GE titer on the right-hand y-axis
represents genome equivalents.

Figures 14A-14F show the nucleotide sequence of
the infectious clone of genotype 1b strain HC-J4 and
Figures 14G-14H show the amino acid sequence encoded by
the HC-J4 clone.

Figure 15 shows the strategy for constructing a
chimeric HCV clone designated pH77CV-J4 which contains the
nonstructural region of the infectious clone of genotype
1a strain H77 and the structural region of the infectious
clone of genotype 1b strain HC-J4.

Figures 16A-16F show the nucleotide sequence of
the chimeric 1a/1b clone pH77CV-J4 of Figure 15 and
Figures 16G-16H show the amino acid sequence encoded by
the chimeric 1a/1b clone.

Figures 17A and 17B show the sequence of the 3'
untranslated region remaining in various 3' deletion
mutants of the 1a infectious clone pCV-H77C and the
strategy utilized in constructing each 3' deletion mutant
(Figures 17C-17G).

Of the seven deletion mutants shown, two
(pCV-H77C(-98X) and (-42X)) have been constructed and tested
for infectivity in chimpanzees (see Figures 17A and 17C)
and the other six are to be constructed and tested for
infectivity as described in Figures 17D-17G.

Figures 18A and 18B show biochemical (ALT
levels), PCR (HCV RNA and HCV GE titer), serological
(anti-HCV) and histopathological (Fig. 18B only) analyses
of chimpanzees 1494 (Fig. 18A) and 1530 (Fig. 18B)
following transfection with the infectious cDNA clone
pCV-H77C.

The ALT serum enzyme levels were measured in
units per liter (U/l). For the PCR analysis, "HCV RNA"
represented by an open rectangle indicates a serum sample
that was negative for HCV after nested PCR; "HCV RNA"
represented by a closed rectangle indicates that the serum
sample was positive for HCV; and HCV GE titer on the
right-hand y-axis represents genome equivalents.

The bar marked "anti-HCV" indicates samples that
were positive for anti-HCV antibodies as determined by
commercial assays. The histopathology scores in Figure
18B correspond to no histopathology (O) , mild hepatitis 
iQ) and moderate to severe hepatitis (O) . 

DESCRIPTION OF THE INVENTION

The present invention relates to nucleic acid
sequences which comprise the genome of an infectious
hepatitis C virus. More specifically, the invention
relates to nucleic acid sequences which encode infectious
hepatitis C viruses of genotype 1a and 1b strains. In one
embodiment, the infectious nucleic acid sequence of the
invention has the sequence shown in Figures 4A-4F of this
application. In another embodiment, the infectious
nucleic acid sequence has the sequence shown in Figures
14A-14F and is contained in a plasmid construct deposited
with the American Type Culture Collection (ATCC) on
January 26, 1998 and having ATCC accession number 209596.

The invention also relates to "chimeric nucleic
acid sequences" where the chimeric nucleic acid sequences
consist of open-reading frame sequences taken from
infectious nucleic acid sequences of hepatitis C viruses
of different genotypes or subtypes.

In one embodiment, the chimeric nucleic acid
sequence consists of sequence from the genome of an HCV
strain belonging to one genotype or subtype which encodes
structural polypeptides and sequence of an HCV strain
belonging to another genotype strain or subtype which
encodes nonstructural polypeptides. Such chimeras can be
produced by standard techniques of restriction digestion,
PCR amplification and subcloning known to those of
ordinary skill in the art.

In a preferred embodiment, the sequence encoding
nonstructural polypeptides is from an infectious nucleic
acid sequence encoding a genotype 1a strain where the
construction of a chimeric 1a/1b nucleic acid sequence is
described in Example 9 and the chimeric 1a/1b nucleic acid
sequence is shown in Figures 16A-16F. It is believed that
the construction of such chimeric nucleic acid sequences
will be of importance in studying the growth and virulence
properties of hepatitis C virus and in the production of
hepatitis C viruses suitable to confer protection against
multiple genotypes of HCV. For example, one might produce
a "multivalent" vaccine by putting epitopes from several
genotypes or subtypes into one clone. Alternatively one
might replace just a single gene from an infectious
sequence with the corresponding gene from the genomic
sequence of a strain from another genotype or subtype or
create a chimeric gene which contains portions of a gene
from two genotypes or subtypes. Examples of genes which
could be replaced or which could be made chimeric,
include, but are not limited to, the E1, E2 and NS4 genes.

The invention further relates to mutations of
the infectious nucleic acid sequences where "mutations"
includes, but is not limited to, point mutations,
deletions and insertions. Of course, one of ordinary
skill in the art would recognize that the size of the
insertions would be limited by the ability of the
resultant nucleic acid sequence to be properly packaged
within the virion. Such mutation could be produced by
techniques known to those of skill in the art such as
site-directed mutagenesis, fusion PCR, and restriction
digestion followed by religation.

In one embodiment, mutagenesis might be
undertaken to determine sequences that are important for
viral properties such as replication or virulence. For
example, one may introduce a mutation into the infectious
nucleic acid sequence which eliminates the cleavage site
between the NS4A and NS4B polypeptides to examine the
effects on viral replication and processing of the
polypeptide. Alternatively, one or more of the 3 amino
acids encoded by the infectious 1b nucleic acid sequence
shown in Figures 14A-14F which differ from the HC-J4
consensus sequence may be back mutated to the
corresponding amino acid in the HC-J4 consensus sequence
to determine the importance of these three amino acid
changes to infectivity or virulence. In yet another
embodiment, one or more of the amino acids from the
noninfectious 1b clones pCV-J4L2S and pCV-J4L4S which
differ from the consensus sequence may be introduced into
the infectious 1b sequence shown in Figures 14A-14F.

In yet another example, one may delete all or
part of a gene or of the 5' or 3' nontranslated region
contained in an infectious nucleic acid sequence and then
transfect a host cell (animal or cell culture) with the
mutated sequence and measure viral replication in the host
by methods known in the art such as RT-PCR. Preferred
genes include, but are not limited to, the P7, NS4B and
NS5A genes. Of course, those of ordinary skill in the art
will understand that deletion of part of a gene,
preferably the central portion of the gene, may be
preferable to deletion of the entire gene in order to
conserve the cleavage site boundaries which exist between
proteins in the HCV polyprotein and which are necessary
for proper processing of the polyprotein.

In the alternative, if the transfection is into
a host animal such as a chimpanzee, one can monitor the
virulence phenotype of the virus produced by transfection
of the mutated infectious nucleic acid sequence by methods
known in the art such as measurement of liver enzyme
levels (alanine aminotransferase (ALT) or isocitrate
dehydrogenase (ICD)) or by histopathology of liver
biopsies. Thus, mutations of the infectious nucleic acid
sequences may be useful in the production of attenuated
HCV strains suitable for vaccine use.

The invention also relates to the use of the 
infectious nucleic acid sequences of the present invention 
to produce attenuated viral strains via passage in vitro 
or in vivo of the virus produced by transfection with the 
infectious nucleic acid sequences. 

The present invention therefore relates to the
use of the nucleic acid sequences of the invention to
identify cell lines capable of supporting the replication
of HCV.

In particular, it is contemplated that the
mutations of the infectious nucleic acid sequences of the
invention and the production of chimeric sequences as
discussed above may be useful in identifying sequences
critical for cell culture adaptation of HCV and hence, may
be useful in identifying cell lines capable of supporting
HCV replication.

Transfection of tissue culture cells with the
nucleic acid sequences of the invention may be done by
methods of transfection known in the art such as
electroporation, precipitation with DEAE-Dextran or
calcium phosphate or liposomes.

In one such embodiment, the method comprises the
growing of animal cells, especially human cells, in vitro
and transfecting the cells with the nucleic acid of the
invention, then determining if the cells show indicia of
HCV infection. Such indicia include the detection of
viral antigens in the cell, for example, by
immunofluorescent procedures well known in the art; the
detection of viral polypeptides by Western blotting using
antibodies specific therefor; and the detection of newly
transcribed viral RNA within the cells via methods such as
RT-PCR. The presence of live, infectious virus particles
following such tests may also be shown by injection of
cell culture medium or cell lysates into healthy,
susceptible animals, with subsequent exhibition of the
symptoms of HCV infection.

Suitable cells or cell lines for culturing HCV
include, but are not limited to, lymphocyte and hepatocyte
cell lines known in the art.

Alternatively, primary hepatocytes can be
cultured, and then infected with HCV; or, the hepatocyte
cultures could be derived from the livers of infected
chimpanzees. In addition, various immortalization methods
known to those of ordinary skill in the art can be used to
obtain cell-lines derived from hepatocyte cultures. For
example, primary hepatocyte cultures may be fused to a
variety of cells to maintain stability.

The present invention further relates to the in 
vitro and in vivo production of hepatitis C viruses from 
the nucleic acid sequences of the invention. 

In one embodiment, the sequences of the 
invention can be inserted into an expression vector that 
functions in eukaryotic cells. Eukaryotic expression 
vectors suitable for producing high efficiency gene 
transfer in vivo are well known to those of ordinary skill 
in the art and include, but are not limited to, plasmids, 
vaccinia viruses, retroviruses, adenoviruses and
adeno-associated viruses.

In another embodiment, the sequences contained 
in the recombinant expression vector can be transcribed in 
vitro by methods known to those of ordinary skill in the 
art in order to produce RNA transcripts which encode the 
hepatitis C viruses of the invention. The hepatitis C 
viruses of the invention may then be produced by 
transfecting cells by methods known to those of ordinary 
skill in the art with either the in vitro transcription 
mixture containing the RNA transcripts (see Example 4) or 
with the recombinant expression vectors containing the 
nucleic acid sequences described herein. 

The present invention also relates to the
construction of cassette vectors useful in the cloning of
viral genomes, wherein said vectors comprise a nucleic acid
sequence to be cloned, said vectors reading in the
correct phase for the expression of the viral nucleic acid
to be cloned. Such a cassette vector will, of course,
also possess a promoter sequence, advantageously placed
upstream of the sequence to be expressed. Cassette
vectors according to the present invention are constructed
according to the procedure described in Figure 1, for
example, starting with plasmid pCV. Of course, the DNA to
be inserted into said cassette vector can be derived from
any virus, advantageously from HCV, and most
advantageously from the H77 strain of HCV. The nucleic
acid to be inserted according to the present invention
can, for example, contain one or more open reading frames
of the virus, for example, HCV. The cassette vectors of
the present invention may also contain, optionally, one or
more expressible marker genes for expression as an
indication of successful transfection and expression of
the nucleic acid sequences of the vector. To ensure
expression, the cassette vectors of the present invention
will contain a promoter sequence for binding of the
appropriate cellular RNA polymerase, which will depend on
the cell into which the vector has been introduced. For
example, if the host cell is a bacterial cell, then said
promoter will be a bacterial promoter sequence to which
the bacterial RNA polymerases will bind.

The hepatitis C viruses produced from the
sequences of the invention may be purified or partially
purified from the transfected cells by methods known to
those of ordinary skill in the art. In a preferred
embodiment, the viruses are partially purified prior to
their use as immunogens in the pharmaceutical compositions
and vaccines of the present invention.

The present invention therefore relates to the
use of the hepatitis C viruses produced from the nucleic
acid sequences of the invention as immunogens in live or
killed (e.g., formalin inactivated) vaccines to prevent
hepatitis C in a mammal.

In an alternative embodiment, the immunogen of
the present invention may be an infectious nucleic acid
sequence, a chimeric nucleic acid sequence, or a mutated
infectious nucleic acid sequence which encodes a hepatitis
C virus. Where the sequence is a cDNA sequence, the cDNAs
and their RNA transcripts may be used to transfect a
mammal by direct injection into the liver tissue of the
mammal as described in the Examples.

Alternatively, direct gene transfer may be
accomplished via administration of a eukaryotic expression
vector containing a nucleic acid sequence of the
invention.

In yet another embodiment, the immunogen may be
a polypeptide encoded by the nucleic acid sequences of the
invention. The present invention therefore also relates
to polypeptides produced from the nucleic acid sequences
of the invention or fragments thereof. In one embodiment,
polypeptides of the present invention can be recombinantly
produced by synthesis from the nucleic acid sequences of
the invention or isolated fragments thereof, and purified,
or partially purified, from transfected cells using
methods already known in the art. In an alternative
embodiment, the polypeptides may be purified or partially
purified from viral particles produced via transfection of
a host cell with the nucleic acid sequences of the
invention. Such polypeptides might, for example, include
either capsid or envelope polypeptides prepared from the
sequences of the present invention.

When used as immunogens, the nucleic acid
sequences of the invention, or the polypeptides or viruses
produced therefrom, are preferably partially purified
prior to use as immunogens in pharmaceutical compositions
and vaccines of the present invention. When used as a
vaccine, the sequences and the polypeptide and virus
products thereof can be administered alone or in a
suitable diluent, including, but not limited to, water,
saline, or some type of buffered medium. The vaccine
according to the present invention may be administered to
an animal, especially a mammal, and most especially a
human, by a variety of routes, including, but not limited
to, intradermally, intramuscularly, subcutaneously, or in
any combination thereof.

Suitable amounts of material to administer for
prophylactic and therapeutic purposes will vary depending
on the route selected and the immunogen (nucleic acid,
virus, polypeptide) administered. One skilled in the art
will appreciate that the amounts to be administered for
any particular treatment protocol can be readily
determined without undue experimentation. The vaccines of
the present invention may be administered once or
periodically until a suitable titer of anti-HCV antibodies
appears in the blood. For an immunogen consisting of a
nucleic acid sequence, a suitable amount of nucleic acid
sequence to be used for prophylactic purposes might be
expected to fall in the range of from about 100 µg to
about 5 mg and most preferably in the range of from about
500 µg to about 2 mg. For a polypeptide, a suitable amount
to use for prophylactic purposes is preferably 100 ng to
100 µg and for a virus 10^ to 10^ infectious doses. Such
administration will, of course, occur prior to any sign of
HCV infection.
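
By way of illustration only, the mass ranges recited above for
prophylactic use can be tabulated and checked as in the following
Python sketch. The names PROPHYLACTIC_RANGES_UG and in_recited_range
are hypothetical and form no part of the disclosed methods; the virus
range is omitted because it is expressed in infectious doses rather
than in units of mass.

```python
# Illustrative sketch only. Ranges are copied from the text above and
# expressed in micrograms; all names are hypothetical helpers.
PROPHYLACTIC_RANGES_UG = {
    "nucleic acid": (100.0, 5000.0),              # about 100 ug to about 5 mg
    "nucleic acid (preferred)": (500.0, 2000.0),  # about 500 ug to about 2 mg
    "polypeptide": (0.1, 100.0),                  # 100 ng to 100 ug
}


def in_recited_range(immunogen: str, dose_ug: float) -> bool:
    """Return True if dose_ug lies within the recited range for immunogen."""
    low, high = PROPHYLACTIC_RANGES_UG[immunogen]
    return low <= dose_ug <= high
```

Such a check merely restates the recited ranges; the amount actually
administered remains a matter for the skilled practitioner.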

A vaccine of the present invention may be 
employed in such forms as capsules, liquid solutions, 
suspensions or elixirs for oral administration, or sterile 
liquid forms such as solutions or suspensions. Any inert 
carrier is preferably used, such as saline or phosphate- 
buffered saline, or any such carrier in which the HCV of 
the present invention can be suitably suspended. The 
vaccines may be in the form of single dose preparations or 
in multi-dose flasks which can be utilized for mass- 
vaccination programs of both animals and humans. For 
purposes of using the vaccines of the present invention 
reference is made to Remington's Pharmaceutical Sciences,
Mack Publishing Co., Easton, Pa., Osol (Ed.) (1980); and
New Trends and Developments in Vaccines, Voller et al.
(Eds.), University Park Press, Baltimore, Md. (1978), both 
of which provide much useful information for preparing and 
using vaccines. Of course, the polypeptides of the 
present invention, when used as vaccines, can include, as 
part of the composition or emulsion, a suitable adjuvant, 
such as alum (or aluminum hydroxide) when humans are to be 
vaccinated, to further stimulate production of antibodies 
by immune cells. When nucleic acids or viruses are used
for vaccination purposes, other specific adjuvants such as
CpG motifs (Krieg, A.K. et al. (1995) and (1996)) may
prove useful.

When the nucleic acids, viruses and polypeptides 
of the present invention are used as vaccines or inocula, 
they will normally exist as physically discrete units 
suitable as a unitary dosage for animals, especially 
mammals, and most especially humans, wherein each unit 
will contain a predetermined quantity of active material 
calculated to produce the desired immunogenic effect in 
association with the required diluent. The dose of said 
vaccine or inoculum according to the present invention is 
administered at least once. In order to increase the 
antibody level, a second or booster dose may be 
administered at some time after the initial dose. The 
need for, and timing of, such booster dose will, of 
course, be determined within the sound judgment of the 
administrator of such vaccine or inoculum and according to
sound principles well known in the art. For example, such
booster dose could reasonably be expected to be
advantageous at some time between about 2 weeks to about 6 
months following the initial vaccination. Subsequent 
doses may be administered as indicated. 

The nucleic acid sequences, viruses and
polypeptides of the present invention can also be
administered for purposes of therapy, where a mammal,
especially a primate, and most especially a human, is
already infected, as shown by well known diagnostic
measures. When the nucleic acid sequences, viruses or
polypeptides of the present invention are used for such
therapeutic purposes, many of the same criteria will apply
as when they are used as a vaccine, except that inoculation
will occur post-infection. Thus, when the nucleic acid
sequences, viruses or polypeptides of the present
invention are used as therapeutic agents in the treatment
of infection, the therapeutic agent comprises a
pharmaceutical composition containing a sufficient amount
of said nucleic acid sequences, viruses or polypeptides so
as to elicit a therapeutically effective response in the
organism to be treated. Of course, the amount of
pharmaceutical composition to be administered will, as for
vaccines, vary depending on the immunogen contained
therein (nucleic acid, polypeptide, virus) and on the
route of administration.

The therapeutic agent according to the present
invention can thus be administered by subcutaneous,
intramuscular or intradermal routes. One skilled in the
art will certainly appreciate that the amounts to be
administered for any particular treatment protocol can be
readily determined without undue experimentation. Of
course, the actual amounts will vary depending on the
route of administration as well as the sex, age, and
clinical status of the subject which, in the case of human
patients, is to be determined with the sound judgment of
the clinician.

The therapeutic agent of the present invention
can be employed in such forms as capsules, liquid
solutions, suspensions or elixirs, or sterile liquid forms
such as solutions or suspensions. Any inert carrier is
preferably used, such as saline, phosphate-buffered
saline, or any such carrier in which the HCV of the
present invention can be suitably suspended. The
therapeutic agents may be in the form of single dose
preparations or in multi-dose flasks which can be
utilized for mass-treatment programs of both animals and
humans. Of course, when the nucleic acid sequences,
viruses or polypeptides of the present invention are used
as therapeutic agents they may be administered as a single
dose or as a series of doses, depending on the situation
as determined by the person conducting the treatment.

The nucleic acids, polypeptides and viruses of 
the present invention can also be utilized in the 
production of antibodies against HCV. The term "antibody" 
is herein used to refer to immunoglobulin molecules and 
immunologically active portions of immunoglobulin 
molecules. Examples of antibody molecules are intact
immunoglobulin molecules, substantially intact
immunoglobulin molecules and portions of an immunoglobulin
molecule, including those portions known in the art as
Fab, F(ab')2 and F(v) as well as chimeric antibody
molecules.

Thus, the polypeptides, viruses and nucleic acid 
sequences of the present invention can be used in the 
generation of antibodies that immunoreact (i.e., specific 
binding between an antigenic determinant-containing
molecule and a molecule containing an antibody combining 
site such as a whole antibody molecule or an active 
portion thereof) with antigenic determinants on the 
surface of hepatitis C virus particles. 



WO 99/04008

PCT/US98/14688

The present invention therefore also relates to
antibodies produced following immunization with the
nucleic acid sequences, viruses or polypeptides of the
present invention. These antibodies are typically
produced by immunizing a mammal with an immunogen or
vaccine to induce antibody molecules having
immunospecificity for polypeptides or viruses produced in
response to infection with the nucleic acid sequences of
the present invention. When used in generating such
antibodies, the nucleic acid sequences, viruses, or
polypeptides of the present invention may be linked to
some type of carrier molecule. The resulting antibody
molecules are then collected from said mammal. Antibodies
produced according to the present invention have the
unique advantage of being generated in response to
authentic, functional polypeptides produced according to
the actual cloned HCV genome.

The antibody molecules of the present invention
may be polyclonal or monoclonal. Monoclonal antibodies
are readily produced by methods well known in the art.
Portions of immunoglobulin molecules, such as Fabs, as well
as chimeric antibodies, may also be produced by methods
well known to those of ordinary skill in the art of
generating such antibodies.

The antibodies according to the present
invention may also be contained in blood plasma, serum,
hybridoma supernatants, and the like. Alternatively, the
antibody of the present invention is isolated to the
extent desired by well known techniques such as, for
example, using DEAE Sephadex. The antibodies produced
according to the present invention may be further purified
so as to obtain specific classes or subclasses of antibody
such as IgM, IgG, IgA, and the like. Antibodies of the
IgG class are preferred for purposes of passive
protection.

The antibodies of the present invention are 
useful in the prevention and treatment of diseases caused 
by hepatitis C virus in animals, especially mammals, and 
most especially humans.

In providing the antibodies of the present 
invention to a recipient mammal, preferably a human, the 
dosage of administered antibodies will vary depending on 
such factors as the mammal's age, weight, height, sex, 
general medical condition, previous medical history, and 
the like. 

In general, it will be advantageous to provide 
the recipient mammal with a dosage of antibodies in the 
range of from about 1 mg/kg body weight to about 10 mg/kg
body weight of the mammal, although a lower or higher dose 
may be administered if found desirable. Such antibodies 
will normally be administered by intravenous or 
intramuscular route as an inoculum. The antibodies of the 
present invention are intended to be provided to the 
recipient subject in an amount sufficient to prevent, 
lessen or attenuate the severity, extent or duration of 
any existing infection. 
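
As a purely illustrative aid, the per-kilogram range recited above
can be converted to a total dose range for a recipient of given body
weight. The function antibody_dose_range_mg below is a hypothetical
helper assumed for this sketch; it does not account for the other
factors (age, medical condition and the like) noted above.

```python
def antibody_dose_range_mg(body_weight_kg: float,
                           low_mg_per_kg: float = 1.0,
                           high_mg_per_kg: float = 10.0) -> tuple:
    """Return (low, high) total antibody dose in mg for the given weight.

    Defaults reflect the range of about 1 to about 10 mg/kg recited in
    the text; a lower or higher dose may be chosen if found desirable.
    """
    if body_weight_kg <= 0:
        raise ValueError("body weight must be positive")
    return (low_mg_per_kg * body_weight_kg, high_mg_per_kg * body_weight_kg)
```

For example, a 70 kg recipient would correspond to a range of 70 mg
to 700 mg under these defaults.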

The antibodies prepared by use of the nucleic
acid sequences, viruses or polypeptides of the present
invention are also highly useful for diagnostic purposes.
For example, the antibodies can be used as in vitro
diagnostic agents to test for the presence of HCV in
biological samples taken from animals, especially humans.
Such assays include, but are not limited to,
radioimmunoassays, EIA, fluorescence, Western blot
analysis and ELISAs. In one such embodiment, the
biological sample is contacted with antibodies of the
present invention and a labeled second antibody is used to
detect the presence of HCV to which the antibodies are
bound.

Such assays may be, for example, a direct
protocol (where the labeled first antibody is
immunoreactive with the antigen, such as, for example, a
polypeptide on the surface of the virus), an indirect
protocol (where a labeled second antibody is reactive with
the first antibody), a competitive protocol (such as would
involve the addition of a labeled antigen), or a sandwich
protocol (where both labeled and unlabeled antibody are
used), as well as other protocols well known and described
in the art.

In one embodiment, an immunoassay method would
utilize an antibody specific for HCV envelope determinants
and would further comprise the steps of contacting a
biological sample with the HCV-specific antibody and then
detecting the presence of HCV material in the test sample
using one of the types of assay protocols as described
above. Polypeptides and antibodies produced according to
the present invention may also be supplied in the form of
a kit, either present in vials as purified material, or
present in compositions and suspended in suitable diluents
as previously described.

In a preferred embodiment, such a diagnostic
test kit for detection of HCV antigens in a test sample
comprises in combination a series of containers, each
container holding a reagent needed for such assay. Thus, one such
container would contain a specific amount of HCV-specific
antibody as already described, a second container would
contain a diluent for suspension of the sample to be
tested, a third container would contain a positive control
and an additional container would contain a negative
control. An additional container could contain a blank.

For all prophylactic, therapeutic and diagnostic 
uses, the antibodies of the invention and other reagents, 
plus appropriate devices and accessories, may be provided 
in the form of a kit so as to facilitate ready 
availability and ease of use. 

The present invention also relates to the use of
nucleic acid sequences and polypeptides of the present
invention to screen potential antiviral agents for
antiviral activity against HCV. Such screening methods
are known by those of skill in the art. Generally, the
antiviral agents are tested at a variety of
concentrations for their effect on preventing viral
replication in cell culture systems which support viral
replication, and then for an inhibition of infectivity or
of viral pathogenicity (and a low level of toxicity) in an
animal model system.

In one embodiment, animal cells (especially
human cells) transfected with the nucleic acid sequences
of the invention are cultured in vitro and the cells are
treated with a candidate antiviral agent (a chemical,
peptide etc.) for antiviral activity by adding the
candidate agent to the medium. The treated cells are then
exposed, possibly under transfecting or fusing conditions
known in the art, to the nucleic acid sequences of the
present invention. A sufficient period of time would then
be allowed to pass for infection to occur, following which
the presence or absence of viral replication would be
determined versus untreated control cells by methods known
to those of ordinary skill in the art. Such methods
include, but are not limited to, the detection of viral
antigens in the cell, for example, by immunofluorescent
procedures well known in the art; the detection of viral
polypeptides by Western blotting using antibodies specific
therefor; the detection of newly transcribed viral RNA
within the cells by RT-PCR; and the detection of the
presence of live, infectious virus particles by injection
of cell culture medium or cell lysates into healthy,
susceptible animals, with subsequent exhibition of the
symptoms of HCV infection. A comparison of results
obtained for control cells (treated only with nucleic acid
sequence) with those obtained for treated cells (nucleic
acid sequence and antiviral agent) would indicate the
degree, if any, of antiviral activity of the candidate
antiviral agent. Of course, one of ordinary skill in the
art would readily understand that such cells can be
treated with the candidate antiviral agent either before
or after exposure to the nucleic acid sequence of the
present invention so as to determine what stage, or
stages, of viral infection and replication said agent is
effective against.
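
The comparison of control and treated cells described above may be
expressed numerically in many ways. One possibility, offered here
only as an assumption and not as the method of the invention, is a
log10 reduction in a quantitative readout such as viral genome titer
measured by RT-PCR. The function name log10_reduction and the example
titers are hypothetical.

```python
import math


def log10_reduction(control_titer: float, treated_titer: float) -> float:
    """Return the log10 reduction of the treated readout versus control.

    Both arguments are assumed to be positive titers measured in the
    same units (for example, genome equivalents per ml).
    """
    if control_titer <= 0 or treated_titer <= 0:
        raise ValueError("titers must be positive")
    return math.log10(control_titer / treated_titer)
```

Under this convention a value of zero indicates no detectable
antiviral effect, and larger values indicate greater inhibition.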
In an alternative embodiment, a protease such as
NS3 protease produced from a nucleic acid sequence of the
invention may be used to screen for protease inhibitors
which may act as antiviral agents. The structural and
nonstructural regions of the HCV genome, including
nucleotide and amino acid locations, have been determined,
for example, as depicted in Houghton, M. (1996), Fig. 1;
and Major, M.E. et al. (1997), Table 1.

Such above-mentioned protease inhibitors may
take the form of chemical compounds or peptides which
mimic the known cleavage sites of the protease and may be
screened using methods known to those of skill in the art
(Houghton, M. (1996) and Major, M.E. et al. (1997)). For
example, a substrate may be employed which mimics the
protease's natural substrate, but which provides a
detectable signal (e.g. by fluorimetric or colorimetric
methods) when cleaved. This substrate is then incubated
with the protease and the candidate protease inhibitor
under conditions of suitable pH, temperature etc. to
detect protease activity. The proteolytic activities of
the protease in the presence or absence of the candidate
inhibitor are then determined.
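
The determination of proteolytic activity in the presence and absence
of a candidate inhibitor can be sketched as follows. This example
assumes, for illustration only, that the detectable signal is read at
several time points and that activity is taken as the least-squares
slope of signal versus time; the function names and the sample
readings are hypothetical.

```python
def initial_rate(times, signals):
    """Return the least-squares slope of signal versus time."""
    n = len(times)
    if n != len(signals) or n < 2:
        raise ValueError("need at least two paired readings")
    mean_t = sum(times) / n
    mean_s = sum(signals) / n
    numerator = sum((t - mean_t) * (s - mean_s) for t, s in zip(times, signals))
    denominator = sum((t - mean_t) ** 2 for t in times)
    if denominator == 0:
        raise ValueError("time points must not all be identical")
    return numerator / denominator


def percent_inhibition(rate_without, rate_with):
    """Return percent loss of activity caused by the candidate inhibitor."""
    if rate_without <= 0:
        raise ValueError("uninhibited rate must be positive")
    return 100.0 * (1.0 - rate_with / rate_without)


# Hypothetical readings (arbitrary signal units at 0, 1, 2 and 3 minutes).
times = [0.0, 1.0, 2.0, 3.0]
rate_no_inhibitor = initial_rate(times, [0.0, 10.0, 20.0, 30.0])
rate_with_inhibitor = initial_rate(times, [0.0, 2.5, 5.0, 7.5])
inhibition = percent_inhibition(rate_no_inhibitor, rate_with_inhibitor)
```

Other measures of activity, such as a single end-point reading, could
equally be used.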

In yet another embodiment, a candidate antiviral
agent (such as a protease inhibitor) may be directly
assayed in vivo for antiviral activity by administering
the candidate antiviral agent to a chimpanzee transfected
with a nucleic acid sequence of the invention and then
measuring viral replication in vivo via methods such as
RT-PCR. Of course, the chimpanzee may be treated with the
candidate agent either before or after transfection with
the infectious nucleic acid sequence so as to determine
what stage, or stages, of viral infection and replication
the agent is effective against.

The invention also provides that the nucleic 
acid sequences, viruses and polypeptides of the invention
may be supplied in the form of a kit, alone or in the form 
of a pharmaceutical composition. 

All scientific publications and/or patents cited
herein are specifically incorporated by reference. The
following examples illustrate various aspects of the
invention but are in no way intended to limit the scope
thereof.

EXAMPLES

MATERIALS AND METHODS
For Examples 1-4

Collection of Virus 

Hepatitis C virus was collected and used as a
source for the RNA used in generating the cDNA clones
according to the present invention. Plasma containing
strain H77 of HCV was obtained from a patient in the acute
phase of transfusion-associated non-A, non-B hepatitis
(Feinstone et al (1981)). Strain H77 belongs to genotype
1a of HCV (Ogata et al (1991), Inchauspe et al (1991)).
The consensus sequence for most of its genome has been
determined (Kolykhalov et al (1996), Ogata et al (1991),
Inchauspe et al (1991) and Farci et al (1996)).

RNA Purification

Viral RNA was collected and purified by
conventional means. In general, total RNA from 10 µl of
H77 plasma was extracted with the TRIzol system (GIBCO
BRL). The RNA pellet was resuspended in 100 µl of 10 mM
dithiothreitol (DTT) with 5% (vol/vol) RNasin (20-40
units/µl) (available from Promega) and 10 µl aliquots were
stored at -80°C. In subsequent experiments RT-PCR was
performed on RNA equivalent to 1 µl of H77 plasma, which
contained an estimated 10^ genome equivalents (GE) of HCV
(Yanagi et al (1996)).

Primers used in the RT-PCR process were deduced
from the genomic sequences of strain H77 according to
procedures already known in the art (see above) or else
were determined specifically for use herein. The primers
generated for this purpose are listed in Table 1.

Table 1. Oligonucleotides used for PCR
amplification of strain H77 of HCV

Designation   Sequence (5'→3')*

H9261F        GGCTACAGCGGGGGGAGACATTTATCACAGC
H3'X58R       TCATGCGGCTCACGGACCTTTCACAGCTAG
H9282F        GTCCAAGCTTATCACAGCGTGTCTCATGCCCGGCCCCG
H3'X45R       CGTCTCTAGAGGACCTTTCACAGCTAGCCGTGACTAGGG
H9375F        TGAAGGTTGGGGTAAACACTCCGGCCTCTTAGGCCATT
H3'X-35R      ACATGATCTGCAGAGAGGCCAGTATCAGCACTCTC
H9386F        GTCCAAGCTTACGCGTAAACACTCCGGCCTCCTTAAGCCATTCCTG
H3'X-38R      CGTCTCTAGACATGATCTGCAGAGAGGCCAGTATCAGCACTCTCTGC
H1            TCTTTTTTGCGGCCGCTAATACGACTCACTATAGCCAGCCCCCTGATGGGGGCGACACTCCACCATG
A1            ACTGTCTTCACGCAGAAAGCGTCTAGCCAT
H9417R        CGTCTCTAGACAGGAAATGGCTTAAGAGGCCGGAGTGTTTACC

* HCV sequences are shown in plain text, non-HCV-specific
sequences are shown in boldface and artificial cleavage sites
used for cDNA cloning are underlined. The core sequence of the
T7 promoter in primer H1 is shown in italics.

Primers for long RT-PCR were size-purified.
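
The artificial cleavage sites carried by the primers of Table 1 can
be located by a simple text search, as in the following Python
sketch. The three primer sequences are taken from Table 1 with
internal spacing removed; the recognition sequences are the standard
ones for the named enzymes. The names RECOGNITION_SITES, PRIMERS and
find_sites are hypothetical and are used for illustration only.

```python
# Standard recognition sequences of enzymes named in the Methods.
RECOGNITION_SITES = {
    "Hind III": "AAGCTT",
    "Xba I": "TCTAGA",
    "Not I": "GCGGCCGC",
    "Afl II": "CTTAAG",
}

# Selected primers from Table 1 (5' to 3'), spacing removed.
PRIMERS = {
    "H3'X45R": "CGTCTCTAGAGGACCTTTCACAGCTAGCCGTGACTAGGG",
    "H9386F": "GTCCAAGCTTACGCGTAAACACTCCGGCCTCCTTAAGCCATTCCTG",
    "H9417R": "CGTCTCTAGACAGGAAATGGCTTAAGAGGCCGGAGTGTTTACC",
}


def find_sites(sequence):
    """Return {enzyme: [0-based start positions]} for each site found."""
    hits = {}
    for enzyme, site in RECOGNITION_SITES.items():
        positions = []
        start = sequence.find(site)
        while start != -1:
            positions.append(start)
            start = sequence.find(site, start + 1)
        if positions:
            hits[enzyme] = positions
    return hits


sites_by_primer = {name: find_sites(seq) for name, seq in PRIMERS.items()}
```

In this sketch the antisense primers carry an Xba I site and the
sense primer H9386F carries a Hind III site, consistent with the
Hind III and Xba I digestion used for cloning the 3' UTR amplicons.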

cDNA Synthesis 

The RNA was denatured at 65°C for 2 min, and
cDNA synthesis was performed in a 20 µl reaction volume
with Superscript II reverse transcriptase (from GIBCO/BRL)
at 42°C for 1 hour using specific antisense primers as
described previously (Tellier et al (1996)). The cDNA
mixture was treated with RNase H and RNase T1 (GIBCO/BRL)
for 20 min at 37°C.

Amplification and Cloning of the 3' UTR 

The 3' UTR of strain H77 was amplified by PCR
in two different assays. In both of these nested PCR
reactions the first round of PCR was performed in a total
volume of 50 µl in 1 X buffer, 250 µmol of each
deoxynucleoside triphosphate (dNTP; Pharmacia), 20 pmol
each of external sense and antisense primers, 1 µl of the
Advantage KlenTaq polymerase mix (from Clontech) and 2 µl
of the final cDNA reaction mixture. In the second round
of PCR, 5 µl of the first round PCR mixture was added to
45 µl of PCR mixture prepared as already described. Each
round of PCR (35 cycles), which was performed in a Perkin
Elmer DNA thermal cycler 480, consisted of denaturation at
94°C for 1 min (in 1st cycle 1 min 30 sec), annealing at
60°C for 1 min and elongation at 68°C for 2 min. In one
experiment a region from NS5B to the conserved region of
the 3' UTR was amplified with the external primers H9261F
and H3'X58R, and the internal primers H9282F and H3'X45R
(Table 1). In another experiment, a segment of the
variable region to the very end of the 3' UTR was
amplified with the external primers H9375F and H3'X-35R,
and the internal primers H9386F and H3'X-38R (Table 1,
Fig. 1). Amplified products were purified with QIAquick
PCR purification kit (from QIAGEN), digested with Hind III
and Xba I (from Promega), purified by either gel
electrophoresis or phenol/chloroform extraction, and then
cloned into the multiple cloning site of plasmid
pGEM-9zf(-) (Promega) or pUC19 (Pharmacia). Cloning of cDNA
into the vector was performed with T4 DNA ligase (Promega)
by standard procedures.

Amplification of Near Full-Length H77 Genomes by Long PCR

The reactions were performed in a total volume
of 50 µl in 1 X buffer, 250 µmol of each dNTP, 10 pmol
each of sense and antisense primers, 1 µl of the Advantage
KlenTaq polymerase mix and 2 µl of the cDNA reaction
mixture (Tellier et al (1996)). A single PCR round of 35
cycles was performed in a Robocycler thermal cycler (from
Stratagene), and consisted of denaturation at 99°C for 35
sec, annealing at 67°C for 30 sec and elongation at 68°C
for 10 min during the first 5 cycles, 11 min during the
next 10 cycles, 12 min during the following 10 cycles and
13 min during the last 10 cycles. To amplify the complete
ORF of HCV by long RT-PCR we used the sense primers H1 or
A1 deduced from the 5' UTR and the antisense primer H9417R
deduced from the variable region of the 3' UTR (Table 1,
Fig. 1).
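
The stepped elongation schedule described above can be summarized
arithmetically. The following Python sketch, with hypothetical names,
totals the programmed hold times for the 35 cycles; it ignores
temperature ramping and any initial or final holds, which are not
specified in the text.

```python
DENATURE_S = 35   # seconds per cycle
ANNEAL_S = 30     # seconds per cycle

# (number of cycles, elongation minutes per cycle), as recited above.
ELONGATION_SCHEDULE = [(5, 10), (10, 11), (10, 12), (10, 13)]


def total_cycles(schedule):
    """Return the total number of cycles in the schedule."""
    return sum(cycles for cycles, _ in schedule)


def programmed_minutes(schedule, denature_s, anneal_s):
    """Return total programmed hold time in minutes, excluding ramping."""
    elongation = sum(cycles * minutes for cycles, minutes in schedule)
    fixed = total_cycles(schedule) * (denature_s + anneal_s) / 60.0
    return elongation + fixed


cycles = total_cycles(ELONGATION_SCHEDULE)
run_minutes = programmed_minutes(ELONGATION_SCHEDULE, DENATURE_S, ANNEAL_S)
```

The schedule accounts for 35 cycles and 410 minutes of elongation,
or roughly seven and a half hours of programmed time in total.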
Construction of Full-Length H77 cDNA Clones

The long PCR products amplified with H1 and
H9417R primers were cloned directly into pGEM-9zf(-) after
digestion with Not I and Xba I (from Promega) (as per
Fig. 1). Two clones were obtained with inserts of the
expected size, pH21j and pH50j. Next, the chosen 3' UTR
was cloned into both pH21j and pH50j after digestion with
Afl II and Xba I (New England Biolabs). DH5α competent
cells (GIBCO/BRL) were transformed and selected with LB
agar plates containing 100 µg/ml ampicillin (from SIGMA).
Then the selected colonies were cultured in LB liquid
containing ampicillin at 30°C for ~18-20 hrs
(transformants containing full-length or near full-length
cDNA of H77 produced a very low yield of plasmid when
cultured at 37°C or for more than 24 hrs). After small
scale preparation (Wizard Plus Minipreps DNA Purification
Systems, Promega) each plasmid was retransformed to select
a single clone, and large scale preparation of plasmid DNA
was performed with a QIAGEN plasmid Maxi kit.

Cloning of Long RT-PCR Products Into a Cassette Vector

To improve the efficiency of cloning, a vector
with consensus 5' and 3' termini of HCV strain H77 was
constructed (Fig. 1). This cassette vector (pCV) was
obtained by cutting out the Bam HI fragment (nts 1358 -
7530 of the H77 genome) from pH50, followed by religation.
Next, the long PCR products of H77 amplified with H1 and
H9417R or A1 and H9417R primers were purified (Geneclean
spin kit; BIO 101) and cloned into pCV after digestion
with Age I and Afl II (New England Biolabs) or with Pin AI
(isoschizomer of Age I) and Bfr I (isoschizomer of Afl II)
(Boehringer Mannheim). Large scale preparations of the
plasmids containing full-length cDNA of H77 were performed
as described above.

Construction of H77 Consensus Chimeric cDNA Clone 

A full-length cDNA clone of H77 with an ORF
encoding the consensus amino acid sequence was constructed
by making a chimera from four of the cDNA clones obtained
above. This consensus chimera, pCV-H77C, was constructed
in two ligation steps by using standard molecular
procedures and convenient cleavage sites and involved
first a two piece ligation and then a three piece
ligation. Large scale preparation of pCV-H77C was
performed as already described.

In Vitro Transcription

Plasmids containing the full-length HCV cDNA
were linearized with Xba I (from Promega), and purified by
phenol/chloroform extraction and ethanol precipitation. A
100 µl reaction mixture containing 10 µg of linearized
plasmid DNA, 1 x transcription buffer, 1 mM ATP, CTP, GTP
and UTP, 10 mM DTT, 4% (v/v) RNasin (20-40 units/µl) and 2
µl of T7 RNA polymerase (Promega) was incubated at 37°C
for 2 hrs. Five µl of the reaction mixture was analyzed
by agarose gel electrophoresis followed by ethidium
bromide staining. The transcription reaction mixture was
diluted with 400 µl of ice-cold phosphate-buffered saline
without calcium or magnesium, immediately frozen on dry
ice and stored at -80°C. The final nucleic acid mixture
was injected into chimpanzees within 24 hrs.

Intrahepatic Transfection of Chimpanzees 

Laparotomy was performed and aliquots from two
transcription reactions were injected into 6 sites of the
exposed liver (Emerson et al (1992)). Serum samples were
collected weekly from chimpanzees and monitored for liver
enzyme levels and anti-HCV antibodies. Weekly samples of
100 µl of serum were tested for HCV RNA in a highly
sensitive nested RT-PCR assay with AmpliTaq Gold (Perkin
Elmer) (Yanagi et al (1996); Bukh et al (1992)). The
genome titer of HCV was estimated by testing 10-fold
serial dilutions of the extracted RNA in the RT-PCR assay
(Yanagi et al (1996)). The two chimpanzees used in this
study were maintained under conditions that met all
requirements for their use in an approved facility.

The consensus sequence of the complete ORF from
HCV genomes recovered at week 2 post inoculation (p.i.)
was determined by direct sequencing of PCR products
obtained in long RT-PCR with primers A1 and H9417R
followed by nested PCR of 10 overlapping fragments. The
consensus sequence of the variable region of the 3' UTR
was determined by direct sequencing of an amplicon
obtained in nested RT-PCR as described above. Finally, we
amplified selected regions independently by nested RT-PCR
with AmpliTaq Gold.

Sequence Analysis

Both strands of DNA from PCR products, as well
as plasmids, were sequenced with the ABI PRISM Dye
Termination Cycle Sequencing Ready Reaction Kit using Taq
DNA polymerase (Perkin Elmer) and about 100 specific sense
and antisense sequence primers.

The consensus sequence of HCV strain H77 was
determined in two different ways. In one approach,
overlapping PCR products, amplified in nested RT-PCR from
the H77 plasma sample, were directly sequenced.
The sequence analyzed (nucleotides (nts) 35-9417) included
the entire genome except the very 5' and 3' termini. In
the second approach, the consensus sequence of nts
157-9384 was deduced from the sequences of 18 full-length
cDNA clones.
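
The second approach, deducing a consensus from several clones, can be illustrated with a simple majority vote at each position. This is a minimal sketch only, assuming the clone sequences have already been aligned to the same length; the function name `consensus`, the tie marker `N` and the toy sequences are illustrative assumptions and are not taken from the procedure described here.

```python
from collections import Counter


def consensus(aligned_seqs, ambiguous="N"):
    """Majority-rule consensus of equal-length, pre-aligned sequences."""
    if not aligned_seqs:
        raise ValueError("need at least one sequence")
    length = len(aligned_seqs[0])
    if any(len(seq) != length for seq in aligned_seqs):
        raise ValueError("sequences must be aligned to the same length")

    result = []
    for column in zip(*aligned_seqs):
        counts = Counter(column).most_common()
        # A tie for first place has no majority, so mark it as ambiguous.
        if len(counts) > 1 and counts[0][1] == counts[1][1]:
            result.append(ambiguous)
        else:
            result.append(counts[0][0])
    return "".join(result)


# Toy clones (hypothetical); each differs from the majority at most once.
clones = ["ACGTAC", "ACGTAC", "ACGAAC", "ATGTAC"]
print(consensus(clones))  # ACGTAC
```

With real data the input would be the aligned sequences of the individual clones, and positions flagged as ambiguous would need to be resolved by other means, for example by direct sequencing.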



35 






WO 99/04008



PCT/US98/14688 




EXAMPLE 1 



Variability in the sequence of the 3' UTR of HCV strain 
H77 

The heterogeneity of the 3' UTR was analyzed by
cloning and sequencing of DNA amplicons obtained in nested
RT-PCR. 19 clones containing sequences of the entire
variable region, the poly U-UC region and the adjacent 19
nt of the conserved region, and 65 clones containing
sequences of the entire poly U-UC region and the first 63
nts of the conserved region were analyzed. This analysis
confirmed that the variable region consisted of 43 nts,
including two conserved termination codons (Han et al
(1992)). The sequence of the variable region was highly
conserved within H77 since only 3 point mutations were
found among the 19 clones analyzed. A poly U-UC region
was present in all 84 clones analyzed. However, its
length varied from 71-141 nts. The length of the poly U
region was 9-103 nts, and that of the poly UC region was
35-85 nts. The number of C residues increased towards the
3' end of the poly UC region but the sequence of this
region is not conserved. The first 63 nts of the
conserved region were highly conserved among the clones
analyzed, with a total of only 14 point mutations. To
confirm the validity of the analysis, the 3' UTR was
reamplified directly from a full-length cDNA clone of HCV
(see below) by the nested PCR procedure with the primers
in the variable region and at the very 3' end of the HCV
genome, and the PCR product was cloned. Eight clones had
1-7 nt deletions in the poly U region. Furthermore, although
the C residues of the poly UC region were maintained, the
spacing of these varied because of 1-2 nt deletions of U
residues. These deletions must be artifacts introduced by
PCR and such mistakes may have contributed to the
heterogeneity originally observed in this region.
However, the conserved region of the 3' UTR was amplified
correctly, suggesting that the deletions were due to
difficulties in transcribing a highly repetitive sequence.

One of the 3' UTR clones was selected for
engineering of full-length cDNA clones of H77. This clone
had the consensus variable sequence except for a single
point mutation introduced to create an Afl II cleavage
site, a poly U-UC stretch of 81 nts with the most commonly
observed UC pattern and the consensus sequence of the
complete conserved region of 101 nts, including the distal
38 nts which originated from the antisense primer used in
the amplification. After linearization with Xba I, the
DNA template of this clone had the authentic 3' end.

EXAMPLE 2

The Entire Open Reading Frame of H77
Amplified in One Round of Long RT-PCR

It had been previously demonstrated that a 9.25
kb fragment of the HCV genome from the 5' UTR to the 3'
end of NS5B could be amplified from 10* GE (genome
equivalents) of H77 by a single round of long RT-PCR
(Tellier et al (1996a)). In the current study, by
optimizing primers and cycling conditions, the entire ORF
of H77 was amplified in a single round of long RT-PCR with
primers from the 5' UTR and the variable region of the 3'
UTR. In fact, 9.4 kb of the H77 genome (H product: from
the very 5' end to the variable region of the 3' UTR) could
be amplified from 10^ GE or 9.3 kb (A product: from within
the 5' UTR to the variable region of the 3' UTR) from 10^
GE or 10^ GE, in a single round of long RT-PCR (Fig. 2).
The PCR products amplified from 10^ GE of H77 were used for
engineering full-length cDNA clones (see below).

EXAMPLE 3 

Construction of Multiple Full-Length
cDNA Clones of H77 in a Single Step by
Cloning of Long RT-PCR Amplicons Directly
into a Cassette Vector with Fixed 5' and 3' Termini



Direct cloning of the long PCR products (H),
which contained a 5' T7 promoter, the authentic 5' end, the
entire ORF of H77 and a short region of the 3' UTR, into
pGEM-9zf(-) vector by Not I and Xba I digestion was first
attempted. However, among the 70 clones examined all but
two had inserts that were shorter than predicted. Sequence
analysis identified a second Not I site in the majority of
clones, which resulted in deletion of the nts past
position 9221. Only two clones (pH21j and pH50j) were
missing the second Not I site and had the expected 5' and
3' sequences of the PCR product. Therefore, full-length
cDNA clones (pH21 and pH50) were constructed by inserting
the chosen 3' UTR into pH21j and pH50j, respectively.
Sequence analysis revealed that clone pH21 had a complete
full-length sequence of H77; this clone was tested for
infectivity. The second clone, pH50, had one nt deletion
in the ORF at position 6365; this clone was used to make a
cassette vector.

The complete ORF was amplified by constructing a
cassette vector with fixed 5' and 3' termini as an
intermediate of the full-length cDNA clones. This vector
(pCV) was constructed by digestion of clone pH50 with
BamHI, followed by religation, to give a shortened plasmid
readily distinguished from plasmids containing the
full-length insert. Attempts to clone long RT-PCR products (H)
into pCV by Age I and Afl II yielded only 1 of 23 clones
with an insert of the expected size. In order to increase
the efficiency of cloning, we repeated the procedure but
used Pin A I and Bfr I instead of the respective
isoschizomers Age I and Afl II. By this protocol, 24 of
31 H clones and 30 of 35 A clones had the full-length cDNA
of H77 as evaluated by restriction enzyme digestion. A
total of 16 clones, selected at random, were each
retransformed, and individual plasmids were purified and
completely sequenced.

EXAMPLE 4 

Demonstration of Infectious Nature
of Transcripts of a cDNA Clone
Representing the Consensus Sequence of Strain H77

A consensus chimera was constructed from 4 of
the full-length cDNA clones with just 2 ligation steps.
The final construct, pCV-H77C, had 11 nt differences from
the consensus sequence of H77 in the ORF (Fig. 3).
However, 10 of these nucleotide differences represented
silent mutations. The chimeric clone differed from the
consensus sequence at only one amino acid [L instead of F
at position 790]. Among the 18 ORFs analyzed above, the F
residue was found in 11 clones and the L residue in 7
clones. However, the L residue was dominant in other
isolates of genotype 1a, including a first passage of H77
in a chimpanzee (Inchauspe et al (1991)).

To test the infectivity of the consensus
chimeric clone of H77, intrahepatic transfection of a
chimpanzee was performed. The pCV-H77C clone was
linearized with Xba I and transcribed in vitro by T7 RNA
polymerase (Fig. 2). The transcription mixture was next
injected into 6 sites of the liver of chimpanzee 1530.
The chimpanzee became infected with HCV as measured by
detection of 10^ GE/ml of viral genome at week 1 p.i.
Furthermore, the HCV titer increased to 10* GE/ml at week
2 p.i., and reached 10^ GE/ml by week 8 p.i. The viremic
pattern observed in the early phase of the infection with
the recombinant virus was similar to that observed in
chimpanzees inoculated intravenously with strain H77 or
other strains of HCV (Shimizu (1990)).

The sequence of the HCV genomes from the serum
sample collected at week 2 p.i. was analyzed. The
consensus sequence of nts 298-9375 of the recovered
genomes was determined by direct sequencing of PCR
products obtained in long RT-PCR followed by nested PCR of
10 overlapping fragments. The identity to clone pCV-H77C
sequence was 100%. The consensus sequence of nts 96-291,
1328-1848, 3585-4106, 4763-5113 and 9322-9445 was
determined from PCR products obtained in different nested
RT-PCR assays. The identity of these sequences with
pCV-H77C was also 100%. These latter regions contained 4
mutations unique to the consensus chimera, including the
artificial Afl II cleavage site in the 3' UTR. Therefore,
RNA transcripts of this clone of HCV were infectious.

The infectious nature of the consensus chimera
indicates that the regions of the 5' and 3' UTRs
incorporated into the cassette vector do not destroy
viability. This makes it highly advantageous to use the
cassette vector to construct infectious cDNA clones of
other HCV strains when the consensus sequence for each ORF
is inserted.

In addition, two complete full-length clones
(dubbed pH21 and pCV-H11) constructed were not infectious,
as shown by intrahepatic injection of chimpanzees with the
corresponding RNA transcripts. Thus, injection of the
transcription mixture into 3 sites of the exposed liver
resulted in no observable HCV replication and weekly serum
samples were negative for HCV RNA at weeks 1-17 p.i. in
a highly sensitive nested RT-PCR assay. The cDNA template
injected along with the RNA transcripts was also not
detected in this assay.

Moreover, the chimpanzee remained negative for
antibodies to HCV throughout the follow-up. Subsequent
sequence analysis revealed that 7 of 16 additional clones
were defective for polyprotein synthesis and all clones
had multiple amino acid mutations compared with the
consensus sequence of the parent strain. For example,
clone pH21, which was not infectious, had 7 amino acid
substitutions in the entire predicted polyprotein compared
with the consensus sequence of H77 (Fig. 3). The most
notable mutation was at position 1026, which changed the L
residue, altering the cleavage site between NS2 and NS3 (Reed
(1995)). Clone pCV-H11, also non-infectious, had 21 amino
acid substitutions in the predicted polyprotein compared
with the consensus sequence of H77 (Fig. 3). The amino
acid mutation at position 564 eliminated a highly
conserved C residue in the E2 protein (Okamoto (1992a)).

EXAMPLE 4A

The chimpanzee of Example 4, designated 1530,
was monitored out to 32 weeks p.i. for serum enzyme levels
(ALT) and the presence of anti-HCV antibodies, HCV RNA,
and liver histopathology. The results are shown in Figure
18B.

A second chimp, designated 1494, was also 
transfected with RNA transcripts of the pCV-H77C clone and 
monitored out to 17 weeks p.i. for the presence of anti- 
HCV antibodies, HCV RNA and elevated serum enzyme levels. 
The results are shown in Figure 18A. 

MATERIALS AND METHODS 
for Examples 5-10

Source Of HCV Genotype 1b

An infectious plasma pool (second chimpanzee
passage) containing strain HC-J4, genotype 1b, was
prepared from acute phase plasma of a chimpanzee
experimentally infected with serum containing HC-J4/91
(Okamoto et al., 1992b). The HC-J4/91 sample was obtained
from a first chimpanzee passage during the chronic phase
of hepatitis C about 8 years after experimental infection.
The consensus sequence of the entire genome, except for
the very 3' end, was determined previously for HC-J4/91
(Okamoto et al., 1992b).

Preparation Of HCV RNA

Viral RNA was extracted from 100 µl aliquots of
the HC-J4 plasma pool with the TRIzol system (GIBCO BRL).
The RNA pellets were each resuspended in 10 µl of 10 mM
dithiothreitol (DTT) with 5% (vol/vol) RNasin (20-40
units/µl) (Promega) and stored at -80°C or immediately
used for cDNA synthesis.

Amplification And Cloning Of The 3' UTR



A region spanning from NS5B to the conserved
region of the 3' UTR was amplified in nested RT-PCR using
the procedure of Yanagi et al. (1997).

In brief, the RNA was denatured at SS'^C for 2
minutes, and cDNA was synthesized at 42°C for 1 hour with
Superscript II reverse transcriptase (GIBCO BRL) and
primer H3'X58R (Table 1) in a 20 µl reaction volume. The
cDNA mixture was treated with RNase H and RNase T1 (GIBCO
BRL) at 37°C for 20 minutes. The first round of PCR was
performed on 2 µl of the final cDNA mixture in a total
volume of 50 µl with the Advantage cDNA polymerase mix
(Clontech) and external primers H9261F (Table 1) and
H3'X58R (Table 1). In the second round of PCR [internal
primers H9282F (Table 1) and H3'X45R (Table 1)], 5 µl of
the first round PCR mixture was added to 45 µl of the PCR
reaction mixture. Each round of PCR (35 cycles) was
performed in a DNA thermal cycler 480 (Perkin Elmer) and
consisted of denaturation at 94°C for 1 minute (1st cycle:
1 minute 30 sec), annealing at 60°C for 1 minute and
elongation at 68°C for 2 minutes. After purification with
QIAquick PCR purification kit (QIAGEN), digestion with
HindIII and XbaI (Promega), and phenol/chloroform
extraction, the amplified products were cloned into
pGEM-9zf(-) (Promega) (Yanagi et al., 1997).


Amplification And Cloning Of The Entire ORF 

A region from within the 5' UTR to the variable
region of the 3' UTR of strain HC-J4 was amplified by long
RT-PCR (Fig. 1) (Yanagi et al., 1997). The cDNA was
synthesized at 42°C for 1 hour in a 20 µl reaction volume
with Superscript II reverse transcriptase and primer
J4-9405R (5'-GCCTATTGGCCTGGAGTGGTTAGCTC-3'), and treated with
RNases as above. The cDNA mixture (2 µl) was amplified by
long PCR with the Advantage cDNA polymerase mix and
primers A1 (Table 1) (Bukh et al., 1992; Yanagi et al.,
1997) and J4-9398R
(5'-AGGATGGCCTTAAGGCCTGGAGTGGTTAGCTCCCCGTTCA-3'). Primer
J4-9398R contained extra bases (bold) and an artificial AflII
cleavage site (underlined). A single PCR round was
performed in a Robocycler thermal cycler (Stratagene), and
consisted of denaturation at 99°C for 35 seconds,
annealing at 67°C for 30 seconds and elongation at 68°C
for 10 minutes during the first 5 cycles, 11 minutes
during the next 10 cycles, 12 minutes during the following
10 cycles and 13 minutes during the last 10 cycles.
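
The stepped elongation schedule above can be written down as a small table and totalled. The following is an illustrative sketch only: the constant and function names are assumptions introduced here, and the total covers programmed hold times, not the ramp times of any particular thermal cycler.

```python
# (number of cycles, elongation minutes per cycle), as stated in the text.
ELONGATION_BLOCKS = [(5, 10), (10, 11), (10, 12), (10, 13)]
DENATURATION_SECONDS = 35  # per cycle, at 99 degrees C
ANNEALING_SECONDS = 30     # per cycle, at 67 degrees C


def total_cycles(blocks):
    """Total number of PCR cycles in the schedule."""
    return sum(cycles for cycles, _ in blocks)


def programmed_minutes(blocks, denaturation_s, annealing_s):
    """Sum of programmed hold times in minutes (ramp times excluded)."""
    elongation = sum(cycles * minutes for cycles, minutes in blocks)
    short_steps = total_cycles(blocks) * (denaturation_s + annealing_s) / 60.0
    return elongation + short_steps


print(total_cycles(ELONGATION_BLOCKS))  # 35
print(round(programmed_minutes(ELONGATION_BLOCKS,
                               DENATURATION_SECONDS,
                               ANNEALING_SECONDS), 1))  # 447.9
```

Under these assumptions the single long PCR round comprises 35 cycles, with about 410 minutes of elongation and roughly 7.5 hours of programmed time in total.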

After digesting the long PCR products obtained
from strain HC-J4 with PinAI (isoschizomer of AgeI) and
BfrI (isoschizomer of AflII) (Boehringer Mannheim),
attempts were made to clone them directly into a cassette
vector (pCV), which contained the 5' and 3' termini of
strain H77 (Figure 1) but no full-length clones were
obtained. Accordingly, to improve the efficiency of
cloning, the PCR product was further digested with BglII
(Boehringer Mannheim) and the two resultant genome
fragments [L fragment: PinAI/BglII, nts 156 - 8935; S
fragment: BglII/BfrI, nts 8936 - 9398] were separately
cloned into pCV (Figure 6).

DH5α competent cells (GIBCO BRL) were
transformed and selected on LB agar plates containing 100
µg/ml ampicillin (SIGMA) and amplified in LB liquid
cultures at 30°C for 18-20 hours.

Sequence analysis of 9 plasmids containing the S
fragment (miniprep samples) and 9 plasmids containing the
L fragment (maxiprep samples) was performed as described
previously (Yanagi et al., 1997). Three L fragments, each
encoding a distinct polypeptide, were cloned into pCV-J4S9
(which contained an S fragment encoding the consensus
amino acid sequence of HC-J4) to construct three chimeric
full-length HCV cDNAs (pCV-J4L2S, pCV-J4L4S and pCV-J4L6S)
(Fig. 6). Large scale preparation of each clone was
performed as described previously with a QIAGEN plasmid
Maxi kit (Yanagi et al., 1997) and the authenticity of
each clone was confirmed by sequence analysis.

Sequence Analysis

Both strands of DNA were sequenced with the ABI
PRISM Dye Termination Cycle Sequencing Ready Reaction Kit
using Taq DNA polymerase (Perkin Elmer) and about 90
specific sense and antisense primers. Analyses of genomic
sequences, including multiple sequence alignments and tree
analyses, were performed with GeneWorks (Oxford Molecular
Group) (Bukh et al., 1995).

The consensus sequence of strain HC-J4 was
determined by direct sequencing of PCR products (nts 11 -
9412) and by sequence analysis of multiple cloned L and S
fragments (nts 156 - 9371). The consensus sequence of the
3' UTR (3' variable region, polypyrimidine tract and the
first 16 nucleotides of the conserved region) was
determined by analysis of 24 cDNA clones.

Intrahepatic Transfection Of A Chimpanzee
With Transcribed RNA



Two in vitro transcription reactions were
performed with each of the three full-length clones. In
each reaction 10 µg of plasmid DNA linearized with Xba I
(Promega) was transcribed in a 100 µl reaction volume with
T7 RNA polymerase (Promega) at 37°C for 2 hours as
described previously (Yanagi et al., 1997). Five µl of
the final reaction mixture was analyzed by agarose gel
electrophoresis and ethidium bromide staining (Fig. 5).
Each transcription mixture was diluted with 400 µl of
ice-cold phosphate-buffered saline without calcium or
magnesium and then the two aliquots from the same cDNA
clone were combined, immediately frozen on dry ice and
stored at -80°C. Within 24 hours after freezing the
transcription mixtures were injected into the chimpanzee
by percutaneous intrahepatic injection that was guided by
ultrasound. Each inoculum was individually injected (5-6
sites) into a separate area of the liver to prevent
complementation or recombination. The chimpanzee was
maintained under conditions that met all requirements for
its use in an approved facility.

Serum samples were collected weekly from the
chimpanzee and monitored for liver enzyme levels and
anti-HCV antibodies. Weekly samples of 100 µl of serum
were tested for HCV RNA in a sensitive nested RT-PCR assay
(Bukh et al., 1992; Yanagi et al., 1996) with AmpliTaq
Gold DNA polymerase. The genome equivalent (GE) titer of
HCV was determined by testing 10-fold serial dilutions of
the extracted RNA in the RT-PCR assay (Yanagi et al.,
1996) with 1 GE defined as the number of HCV genomes
present in the highest dilution which was positive in the
RT-nested PCR assay.
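
The end-point definition of 1 GE given above lends itself to a short calculation. The sketch below is illustrative only: the function name, the dictionary layout and the example results are hypothetical, and it assumes that the undiluted extract represents the full 100 µl of serum, which the text does not state explicitly.

```python
def endpoint_titer_ge_per_ml(results, serum_ul=100.0):
    """Estimate a genome-equivalent (GE) titer from an end-point dilution series.

    results maps the dilution exponent k (extract diluted 10**-k) to True
    when the nested RT-PCR was positive at that dilution.
    """
    positive = [k for k, is_positive in results.items() if is_positive]
    if not positive:
        return None  # nothing detected; titer is below the assay's limit
    # The highest positive dilution is defined as containing 1 GE, so the
    # undiluted extract contains 10**k GE.
    ge_in_extract = 10 ** max(positive)
    # Assumption: the undiluted extract corresponds to all of the serum input.
    return ge_in_extract * 1000.0 / serum_ul


# Hypothetical series: positive down to the 10**-3 dilution, negative at 10**-4.
series = {0: True, 1: True, 2: True, 3: True, 4: False}
print(endpoint_titer_ge_per_ml(series))  # 10000.0
```

Because the dilutions are 10-fold, a titer estimated this way is resolved only to the nearest order of magnitude.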

To identify which of the three clones was
infectious in vivo, the NS3 region (nts 3659 - 4110) from
the chimpanzee serum was amplified in a highly sensitive
and specific nested RT-PCR assay with AmpliTaq Gold DNA
polymerase and the PCR products were cloned with a TA
cloning kit (Invitrogen). In addition, the consensus
sequence of the nearly complete genome (nts 11 - 9441) was
determined by direct sequencing of overlapping PCR
products.

EXAMPLE 5



Sequence Analysis Of Infectious Plasma Pool 
Of Strain HC-J4 Used As The Cloning Source 



As an infectious cDNA clone of a genotype 1a
strain of HCV had been obtained only after the ORF was
engineered to encode the consensus polypeptide (Kolykhalov
et al., 1997; Yanagi et al., 1997), a detailed sequence
analysis of the cloning source was performed to determine
the consensus sequence prior to constructing an infectious
cDNA clone of a 1b genotype.

A plasma pool of strain HC-J4 was prepared from 
acute phase plasmapheresis units collected from a 
chimpanzee experimentally infected with HC-J4/91 (Okamoto 
et al., 1992b). This HCV pool had a PCR titer of 10* -
10^ GE/ml and an infectivity titer of approximately 10^ 
chimpanzee infectious doses per ml. 

The heterogeneity of the 3' UTR of strain HC-J4
was determined by analyzing 24 clones of nested RT-PCR
product. The consensus sequence was identical to that
previously published for HC-J4/91 (Okamoto et al., 1992b),
except at position 9407 (see below). The variable region
consisted of 41 nucleotides (nts 9372 - 9412), including
two in-frame termination codons. Furthermore, its
sequence was highly conserved except at positions 9399 (19
A and 5 T clones) and 9407 (17 T and 7 A clones). The
poly U-UC region varied slightly in composition and
greatly in length (31-162 nucleotides). In the conserved
region, the first 16 nucleotides of 22 clones were
identical to those previously published for other genotype
1 strains, whereas two clones each had a single point
mutation. These data suggested that the structural
organization at the 3' end of HC-J4 was similar to that of
the infectious clone of a genotype 1a strain of Yanagi et
al. (1997).

Next, the entire ORF of HC-J4 was amplified in a
single round of long RT-PCR (Figure 5). The original plan
was to clone the resulting PCR products into the PinAI and
BfrI sites of an HCV cassette vector (pCV), which had fixed
5' and 3' termini of genotype 1a (Yanagi et al., 1997), but
since full-length clones were not obtained, two genome
fragments (L and S) derived from the long RT-PCR products
(Figure 6) were separately subcloned into pCV.

To determine the consensus sequence of the ORF,
the sequence of 9 clones each of the L fragment (pCV-J4L)
and of the S fragment (pCV-J4S) was determined and
quasispecies were found at 275 nucleotide (3.05%) and 78
amino acid (2.59%) positions, scattered throughout the
9030 nts (3010 aa) of the ORF (Figure 7). Of the 161
nucleotide substitutions unique to a single clone, 71%
were at the third position of the codon and 72% were
silent.
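
The proportions quoted above follow directly from the stated counts and ORF length. The short check below is a sketch for the reader's convenience; the helper name `percent` is an assumption and is not part of the original analysis.

```python
def percent(count, total):
    """Return count as a percentage of total, rounded to two decimals."""
    return round(100.0 * count / total, 2)


ORF_NUCLEOTIDES = 9030
ORF_AMINO_ACIDS = 3010

# The stated ORF length in nucleotides is three times the length in codons.
assert ORF_NUCLEOTIDES == 3 * ORF_AMINO_ACIDS

print(percent(275, ORF_NUCLEOTIDES))  # 3.05 (variable nucleotide positions)
print(percent(78, ORF_AMINO_ACIDS))   # 2.59 (variable amino acid positions)
```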

Each of the nine L clones represented the near
complete ORF of an individual genome. The differences
among the L clones were 0.30 - 1.53% at the nucleotide and
0.31 - 1.47% at the amino acid level (Figure 8). Two
clones, L1 and L7, had a defective ORF due to a single
nucleotide deletion and a single nucleotide insertion,
respectively. Even though the HC-J4 plasma pool was
obtained in the early acute phase, it appeared to contain
at least three viral species (Figure 9). Species A
contained the L1, L2, L6, L8 and L9 clones, species B the
L3, L7 and L10 clones and species C the L4 clone.
Although each species A clone was unique, all A clones
differed from all B clones at the same 20 amino acid sites
and at these positions, species C had the species A
sequence at 14 positions and the species B sequence at 6
positions (Figure 7).

Okamoto and coworkers (Okamoto et al., 1992b)
previously determined the nearly complete genome consensus
sequence of strain HC-J4 in acute phase serum of the first
chimpanzee passage (HC-J4/83) as well as in chronic phase
serum collected 8.2 years later (HC-J4/91). In addition,
they determined the sequence of amino acids 379 to 413
(including HVR1) and amino acids 468 to 486 (including
HVR2) of multiple individual clones (Okamoto et al.,
1992b).

It was found by the present inventors that the
sequences of individual genomes in the plasma pool
collected from a chimpanzee inoculated with HC-J4/91 were
all more closely related to HC-J4/91 than to HC-J4/83
(Figures 8, 9) and contained HVR amino acid sequences
closely related to three of the four viral species
previously found in HC-J4/91 (Figure 10).

Thus, the data presented herein demonstrate the
occurrence of the simultaneous transmission of multiple
species to a single chimpanzee and clearly illustrate the
difficulties in accurately determining the evolution of
HCV over time since multiple species with significant
changes throughout the HCV genome can be present from the
onset of the infection. Accordingly, infection of
chimpanzees with monoclonal viruses derived from the
infectious clones described herein will make it possible
to perform more detailed studies of the evolution of HCV
in vivo and its importance for viral persistence and
pathogenesis.

EXAMPLE 6 

Determination Of The Consensus
Sequence Of HC-J4 In The Plasma Pool

The consensus sequence of nucleotides 156-9371
of HC-J4 was determined by two approaches. In one
approach, the consensus sequence was deduced from 9 clones
of the long RT-PCR product. In the other approach the
long RT-PCR product was reamplified by PCR as overlapping
fragments which were sequenced directly. The two
"consensus" sequences differed at 31 (0.34%) of 9216
nucleotide positions and at 11 (0.37%) of 3010 deduced
amino acid positions (Figure 7). At all of these
positions a major quasispecies of strain HC-J4 was found
in the plasma pool. At 9 additional amino acid positions
the cloned sequences displayed heterogeneity and the
direct sequence was ambiguous (Figure 7). Finally, it
should be noted that there were multiple amino acid
positions at which the consensus sequence obtained by
direct sequencing was identical to that obtained by
cloning and sequencing even though a major quasispecies
was detected (Figure 7).

For positions at which the two "consensus"
sequences of HC-J4 differed, both amino acids were
included in a composite consensus sequence (Figure 7).
However, even with this allowance, none of the 9 L clones
analyzed (aa 1 - 2864) had the composite consensus
sequence: two clones did not encode the complete
polypeptide and the remaining 7 clones differed from the
consensus sequence by 3 - 13 amino acids (Figure 7).
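
The idea of a composite consensus, in which either residue is accepted wherever the two "consensus" sequences disagree, can be sketched as follows. The function names and the short peptide strings are hypothetical and serve only to illustrate the comparison; they are not HC-J4 sequences, and the sketch assumes all sequences are aligned to the same length.

```python
def composite_consensus(direct, cloned):
    """Per-position sets of residues allowed by two aligned consensus sequences."""
    if len(direct) != len(cloned):
        raise ValueError("consensus sequences must have the same length")
    return [{a, b} for a, b in zip(direct, cloned)]


def count_differences(clone, composite):
    """Number of positions where the clone matches neither consensus residue."""
    if len(clone) != len(composite):
        raise ValueError("clone must have the same length as the composite")
    return sum(1 for residue, allowed in zip(clone, composite)
               if residue not in allowed)


# Hypothetical consensus sequences that disagree at positions 4 and 7.
direct = "MSTNPKQ"
cloned = "MSTDPKR"
composite = composite_consensus(direct, cloned)

print(count_differences("MSTDPKQ", composite))  # 0: mixes both, still allowed
print(count_differences("MATDPKQ", composite))  # 1: position 2 matches neither
```

A clone counts as having the composite consensus sequence only when its difference count is zero, which in the analysis above was the case for none of the 9 L clones.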

EXAMPLE 7 

Construction Of Chimeric Full-Length cDNA
Clones Containing The Entire ORF Of HC-J4

The cassette vector used to clone strain H77 was
used to construct an infectious cDNA clone containing the
ORF of a second subtype.

In brief, three full-length cDNA clones were
constructed by cloning different L fragments into the
PinAI/BglII site of pCV-J4S9, the cassette vector for
genotype 1a (Figure 6), which also contained an S fragment
encoding the consensus amino acid sequence of HC-J4.
Therefore, although the ORF was from strain HC-J4, most of
the 5' and 3' terminal sequences originated from strain
H77. As a result, the 5' and 3' UTR were chimeras of
genotypes 1a and 1b (Figure 11).

The first 155 nucleotides of the 5' UTR were
from strain H77 (genotype 1a), and differed from the
authentic sequence of HC-J4 (genotype 1b) at nucleotides
11, 12, 13, 34 and 35. In two clones (pCV-J4L2S,
pCV-J4L6S) the rest of the 5' UTR had the consensus sequence
of HC-J4, whereas the third clone (pCV-J4L4S) had a single
nucleotide insertion at position 207. In all 3 clones the
first 27 nucleotides of the 3' variable region of the 3'
UTR were identical with the consensus sequence of HC-J4.
The remaining 15 nucleotides of the variable region, the
poly U-UC region and the 3' conserved region of the 3' UTR
had the same sequence as an infectious clone of strain H77
(Figure 11).

None of the three full-length clones of HC-J4
had the ORF composite consensus sequence (Figures 7, 12).
The pCV-J4L6S clone had only three amino acid changes: Q
for R at position 231 (E1), V for A at position 937 (NS2)
and T for S at position 1215 (NS3). The pCV-J4L4S clone
had 7 amino acid changes, including a change at position
450 (E2) that eliminated a highly conserved N-linked
glycosylation site (Okamoto et al., 1992a). Finally, the
pCV-J4L2S clone had 9 amino acid changes compared with the
consensus sequence of HC-J4. A change at position 304
(E1) mutated a highly conserved cysteine residue (Bukh et
al., 1993; Okamoto et al., 1992a).

EXAMPLE 8

Transfection Of A Chimpanzee By In
Vitro Transcripts Of A Chimeric cDNA

The infectivity of the three chimeric HCV clones
was determined by ultrasound-guided percutaneous
intrahepatic injection into the liver of a chimpanzee of
the same amount of cDNA and transcription mixture for each
of the clones (Figure 5). This procedure is less
invasive than the laparotomy procedure utilized
by Kolykhalov et al. (1997) and Yanagi et al. (1997) and
should facilitate in vivo studies of cDNA clones of HCV in
chimpanzees since percutaneous procedures, unlike
laparotomy, can be performed repeatedly.

As shown in Figure 13, the chimpanzee became
infected with HCV as measured by increasing titers of 10^ 
GE/ml at week 1 p.i., 10^ GE/ml at week 2 p.i. and 10^ - 
10^ GE/ml during weeks 3 to 10 p.i. 

The viremic pattern found in the early phase of
the infection was similar to that observed for the
recombinant H77 virus in chimpanzees (Bukh et al.,
unpublished data; Kolykhalov et al., 1997; Yanagi et al.,
1997). The chimpanzee transfected in the present study
was chronically infected with hepatitis G virus
(HGV/GBV-C) (Bukh et al., 1998) and had a titer of 10^ GE/ml at the
time of HCV transfection. Although HGV/GBV-C was
originally believed to be a hepatitis virus, it does not
cause hepatitis in chimpanzees (Bukh et al., 1998) and may
not replicate in the liver (Laskus et al., 1997). The
present study demonstrated that an ongoing infection of
HGV/GBV-C did not prevent acute HCV infection in the
chimpanzee model.

To identify which of the three full-length HC-J4
clones were infectious, the NS3 region (nts 3659-4110)
of HCV genomes amplified by RT-PCR from serum
samples taken from the infected chimpanzee during weeks 2
and 4 post-infection (p.i.) was cloned and sequenced. As
the PCR primers were a complete match with each of the
original three clones, this assay should not have
preferentially amplified one virus over another. Sequence
analysis of 26 and 24 clones obtained at weeks 2 and 4
p.i., respectively, demonstrated that all originated from
the transcripts of pCV-J4L6S.
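
The assignment of sequenced RT-PCR clones to one of the three transfected plasmids can be illustrated with a minimal sketch. It assumes that each clone sequence has already been aligned to the same region of the three references and simply picks the reference with the fewest mismatches. The function names are hypothetical, and the ten-nucleotide sequences are invented placeholders used only to make the example runnable; they are not data from this study.

```python
from collections import Counter


def hamming(a, b):
    """Count mismatching positions between two aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b))


def assign_clone(read, references):
    """Return the name of the closest reference, or None if the best score is tied."""
    scored = sorted((hamming(read, seq), name) for name, seq in references.items())
    if len(scored) > 1 and scored[0][0] == scored[1][0]:
        return None  # ambiguous: two references are equally close
    return scored[0][1]


# Placeholder sequences (assumption): stand-ins for the amplified NS3 region.
references = {
    "pCV-J4L2S": "ACGTTGCAAC",
    "pCV-J4L4S": "ACGATGCAGC",
    "pCV-J4L6S": "ACGTTGCTGC",
}

# Placeholder clone reads (assumption); the last one carries one mismatch.
reads = ["ACGTTGCTGC", "ACGTTGCTGC", "ACGTTGCTGA"]

# Tally how many reads are assigned to each reference plasmid.
tally = Counter(assign_clone(read, references) for read in reads)
print(tally)
```

Returning None on a tie keeps ambiguous clones out of the tally instead of assigning them arbitrarily.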

Moreover, the consensus sequence of PCR products
of the nearly complete genome (nts 11-9441), amplified
from serum obtained during week 2 p.i., was identical to
the sequence of pCV-J4L6S and there was no evidence of
quasispecies. Thus, RNA transcripts of pCV-J4L6S, but not
of pCV-J4L2S or pCV-J4L4S, were infectious in vivo. The
data in Figure 13 are therefore the product of the
transfection of RNA transcripts of pCV-J4L6S.

In addition, the chimeric sequences of genotypes
1a and 1b in the UTRs were maintained in the infected
chimpanzee. The consensus sequence of nucleotides 11-341
of the 5' UTR and the variable region of the 3' UTR,
amplified from serum obtained during weeks 2 and 4 p.i.,
had the expected chimeric sequence of genotypes 1a and 1b
(Fig. 11). Also, three of four clones of the 3' UTR
obtained at week 2 p.i. had the chimeric sequence of the
variable region, whereas a single substitution was noted
in the fourth clone. However, in all four clones the poly
U region was longer (2-12 nts) than expected. Also, extra
C and G residues were observed in this region. For the
most part, the number of C residues in the poly UC region
was maintained in all clones although the spacing varied.
As shown previously, variations in the number of U
residues can reflect artifacts introduced during PCR
amplification (Yanagi et al., 1997). The sequence of the
first 19 nucleotides of the conserved region was
maintained in all four clones. Thus, with the exception
of the poly U-UC region, the genomic sequences recovered
from the infected chimpanzee were exactly those of the
chimeric infectious clone pCV-J4L6S.

The results presented in Figure 13 therefore
demonstrate that HCV polypeptide sequences other than the
consensus sequence can be infectious and that a chimeric
genome containing portions of the H77 termini could
produce an infectious virus. In addition, these results
showed for the first time that it is possible to make
infectious viruses containing 5' and 3' terminal sequences
specific for two different subtypes of the same major
genotype of HCV.

EXAMPLE 9

Construction Of A Chimeric
1a/1b Infectious Clone

A chimeric 1a/1b infectious clone in which the
structural region of the genotype 1b infectious clone is
inserted into the 1a clone of Yanagi et al. (1997) is
constructed by following the protocol shown in Figure 15.
The resultant chimera contains nucleotides 156-2763 of the
1b clone described herein inserted into the 1a clone of
Figures 4A-4F. The sequences of the primers shown in
Figure 15 which are used in constructing this chimeric
clone, designated pH77CV-J4, are presented below.

1. H2751S (Cla I/Nde I)
CGT CAT CGA TCC TCA GCG GGC ATA TGC ACT GGA CAC GGA

2. H2870R
CAT GCA CCA GCT GAT ATA GCG CTT GTA ATA TG

3. H7851S
TCC GTA GAG GAA GCT TGC AGC CTG ACG CCC

4. H9173R (P-M)
GTA CTT GCC ACA TAT AGC AGC CCT GCC TCC TCT G

5. H9140S (P-M)
CAG AGG AGG CAG GGC TGC TAT ATG TGG CAA GTA C

6. H9417R
CGT CTC TAG ACA GGA AAT GGC TTA AGA GGC CGG AGT GTT
TAC C

7. J4-2271S
TGC AAT TGG ACT CGA GGA GAG CGC TGT AAC TTG GAG

8. J4-2776R (Nde I)
CGG TCC AAG GCA TAT GCT CGT GGT GGT AAC GCC AG
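
The primer sequences listed above can be checked with a short script. The sketch below, offered as an illustration only, confirms two properties that can be read directly from the sequences: primers 4 and 5, the (P-M) pair, are exact reverse complements of each other, and the primers annotated with Cla I or Nde I contain the standard recognition sequences of those enzymes (ATCGAT and CATATG). The helper names are hypothetical. The full cloning protocol is shown in Figure 15 and is not reproduced here.

```python
# Translation table for complementing DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Standard recognition sequences of the two enzymes named in the primer labels.
SITES = {"ClaI": "ATCGAT", "NdeI": "CATATG"}

# Primers copied from the list above (spaces separate triplets for readability).
PRIMERS = {
    "H2751S": "CGT CAT CGA TCC TCA GCG GGC ATA TGC ACT GGA CAC GGA",
    "H9173R(P-M)": "GTA CTT GCC ACA TAT AGC AGC CCT GCC TCC TCT G",
    "H9140S(P-M)": "CAG AGG AGG CAG GGC TGC TAT ATG TGG CAA GTA C",
    "J4-2776R": "CGG TCC AAG GCA TAT GCT CGT GGT GGT AAC GCC AG",
}


def clean(seq):
    """Remove whitespace and upper-case a primer written in triplets."""
    return "".join(seq.split()).upper()


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return clean(seq).translate(COMPLEMENT)[::-1]


def find_sites(seq):
    """Map each enzyme whose site occurs in seq to the 0-based position of the site."""
    s = clean(seq)
    return {name: s.find(site) for name, site in SITES.items() if site in s}


# The (P-M) pair anneals to the same locus on opposite strands.
pair_is_complementary = (
    reverse_complement(PRIMERS["H9140S(P-M)"]) == clean(PRIMERS["H9173R(P-M)"])
)
print(pair_is_complementary)

for name in ("H2751S", "J4-2776R"):
    print(name, find_sites(PRIMERS[name]))
```

A fully complementary sense/antisense pair of this kind is what one would expect for primers that introduce the same change on both strands, although the text itself does not state the purpose of the (P-M) pair.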

Transcripts of the chimeric 1a/1b clone (whose
sequence is shown in Figures 16A-16F) are then produced
and transfected into chimpanzees by the methods described
in the Materials and Methods section herein, and the
transfected animals are then subjected to biochemical
(ALT levels), histopathological and PCR analyses to
determine the infectivity of the chimeric clone.

EXAMPLE 10 

Construction Of 3' Deletion Mutants
Of The 1a Infectious Clone pCV-H77C

Seven constructs having various deletions in the
3' untranslated region (UTR) of the 1a infectious clone
pCV-H77C were constructed as described in Figures 17A-17B.
The 3' untranslated sequence remaining in each of the
seven constructs following their respective deletions is
shown in Figures 17A-17B.

Construct pCV-H77C(-98X), containing a deletion
of the 3'-most 98 nucleotides of the 3' UTR, was
transcribed in vitro according to the methods described
herein and 1 ml of the diluted transcription mixture was
percutaneously transfected into the liver of a chimpanzee
with the aid of ultrasound. After three weeks, the
transfection was repeated. The chimpanzee was observed to
be negative for hepatitis C virus replication as measured
by RT-PCR assay for 5 weeks after transfection. These
results demonstrate that the deleted 98 nucleotide 3' UTR
sequence was critical for production of infectious HCV and
appear to contradict the reports of Dash et al. (1996) and
Yoo et al. (1995), who reported that RNA transcripts from
cDNA clones of HCV-1 and HCV-N lacking the terminal 98
conserved nucleotides at the very 3' end of the 3' UTR
resulted in viral replication after transfection into
human hepatoma cell lines.

Transcripts of the (-42X) mutant (Figure 17C)
were also produced and transfected into a chimpanzee, and
transcripts of the other five deletion mutants (shown in
Figures 17D-17G) are to be produced and transfected into
chimpanzees by the methods described herein. All
transfected animals are then to be assayed for viral
replication via RT-PCR.

Discussion

In two recent reports on transfection of
chimpanzees, only those clones engineered to have the
independently determined and slightly different consensus
amino acid sequence of the polypeptide of strain H77 were
infectious (Kolykhalov et al., 1997; Yanagi et al., 1997).
Although the two infectious clones differed at four amino
acid positions, these differences were represented in a
major component of the quasispecies of the cloning source.
In the present study, a single consensus sequence of
strain HC-J4 could not be defined because the consensus
sequence obtained by two different approaches (direct
sequencing and sequencing of cloned products) differed at
20 amino acid positions, even though the same genomic PCR
product was analyzed. The infectious clone differed at
two positions from the composite amino acid consensus
sequence, from the sequence of the 8 additional HC-J4
clones analyzed in this study and from published sequences
of earlier passage samples. An additional amino acid
differed from the composite consensus sequence but was
found in two other HC-J4 clones analyzed in this study.
The two non-infectious full-length clones of HC-J4
differed from the composite consensus sequence by only 7
and 9 amino acids. However, since these clones
had the same termini as the infectious clone (except for a
single nucleotide insertion in the 5' UTR of pCV-J4L4S),
one or more of these amino acid changes in each clone was
apparently deleterious for the virus.

It was also found in the present study that HC-J4,
like other strains of genotype 1b (Kolykhalov et al.,
1996; Tanaka et al., 1996; Yamada et al., 1996), had a
poly U-UC region followed by a terminal conserved element.
The poly U-UC region appears to vary considerably, so it
was not clear whether changes in this region would have a
significant effect on virus replication. On the other
hand, the 3' 98 nucleotides of the HCV genome were
previously shown to be identical among other strains of
genotypes 1a and 1b (Kolykhalov et al., 1996; Tanaka et
al., 1996). Thus, use of the cassette vector would not
alter this region except for addition of 3 nucleotides
found in strain H77 between the poly UC region and the 3'
98 conserved nucleotides.

In conclusion, an infectious clone representing
a genotype 1b strain of HCV has been constructed. Thus,
it has been demonstrated that it was possible to obtain an
infectious clone of a second strain of HCV. In addition,
it has been shown that a consensus amino acid sequence was
not absolutely required for infectivity and that chimeras
between the UTRs of two different genotypes could be
viable.

WHAT IS CLAIMED IS:

1. A purified and isolated nucleic acid 
molecule which encodes human hepatitis C virus, said 
molecule capable of expressing said virus when transfected 
into cells. 

2. The nucleic acid molecule of claim 1,
wherein said molecule encodes the amino acid sequence
shown in Figures 14G-14H.

3. The nucleic acid molecule of claim 2,
wherein said molecule comprises the nucleic acid sequence
shown in Figures 14A-14F.

4. The nucleic acid molecule of
claim 1, wherein said molecule encodes the amino acid
sequence shown in Figures 4G-4H.

5. The nucleic acid molecule of claim 4, 
wherein said molecule comprises the nucleic acid sequence 
shown in Figures 4A-4F. 

6. The nucleic acid molecule of claim 1, 
wherein a fragment of said molecule which encodes the 
structural region of hepatitis C virus has been replaced 
by the structural region from the genome of another 
hepatitis C virus strain. 

7. The nucleic acid molecule of claim 6, 
wherein said molecule encodes the amino acid sequence 
shown in Figures 16G-16H. 

8. The nucleic acid molecule of claim 7, 
wherein said molecule comprises the nucleic acid sequence 
shown in Figures 16A-16F. 

9. The nucleic acid molecule of claim 1,
wherein a fragment of the nucleic acid molecule which
encodes at least one HCV protein has been replaced by a
fragment of the genome of another hepatitis C virus strain
which encodes the corresponding protein.

10. The nucleic acid molecule of claim 9,
wherein the protein is selected from the group consisting
of E1, E2 or NS4 proteins.

11. The nucleic acid molecule of claim 1,
wherein a fragment of the molecule encoding all or part of
an HCV protein has been deleted.

12. The nucleic acid molecule of claim 11,
wherein the HCV protein is selected from the group
consisting of P7, NS4B or NS5A proteins.

13. A DNA construct comprising a nucleic acid
molecule according to claims 1, 2, 5 or 8.

14. An RNA transcript of the DNA construct of
claim 13.

15. A cell transfected with the DNA construct
of claim 13.

16. A cell transfected with RNA transcript of
claim 14.

17. A hepatitis C virus polypeptide produced by
the cell of claim 15.

18. A hepatitis C virus polypeptide produced by
the cell of claim 16.

19. A hepatitis C virus produced by the cell of
claim 13.

20. A hepatitis C virus produced by the cell of
claim 14.

21. A hepatitis C virus whose genome comprises
a nucleic acid molecule according to claims 1, 3, 5, 6, 8
or 9.

22. A method for producing a hepatitis C virus
comprising transfecting a host cell with the RNA
transcript of claim 14.

23. A polypeptide encoded by a nucleic acid
sequence according to claims 1, 2, 4 or 7 or a fragment
thereof.

24. The polypeptide of claim 23, wherein said
polypeptide is selected from the group consisting of NS3
protease, E1 protein, E2 protein or NS4 protein.

25. A method for assaying candidate antiviral
agents for activity against HCV, comprising:

a) exposing a cell containing the hepatitis C
virus of claim 21 to the candidate antiviral agent; and

b) measuring the presence or absence of
hepatitis C virus replication in the cell of step (a).

26. The method of claim 25, wherein said
replication in step (b) is measured by at least one of the
following: negative strand RT-PCR, quantitative RT-PCR,
Western blot, immunofluorescence, or infectivity in a
susceptible animal.

27. A method for assaying candidate antiviral
agents for activity against HCV, comprising:

a) exposing an HCV protease encoded by a nucleic acid
sequence according to claims 1, 2, 4, or 7, or a fragment
thereof, to the candidate antiviral agent in the
presence of a protease substrate; and

b) measuring the protease activity of said protease.

28. The method of claim 27, wherein said HCV
protease is selected from the group consisting of an NS3
domain protease, an NS3-NS4A fusion polypeptide, or an
NS2-NS3 protease.

29. An antiviral agent identified as having
antiviral activity for HCV by the method of claim 25.

30. An antiviral agent identified as having
antiviral activity for HCV by the method of claim 27.

31. Antibody to the polypeptide of claim 23.

32. Antibody to the hepatitis C virus of claim
21.

33. A method for determining the susceptibility
of cells in vitro to support HCV infection, comprising the
steps of:

a. growing animal cells in vitro;

b. transfecting into said cells the nucleic acid of
claim 1; and

c. determining if said cells show indicia of HCV
replication.

34. The method according to claim 33, wherein
said cells are human cells.

35. A cassette vector for cloning viral
genomes, comprising, inserted therein, the nucleic acid
sequence according to claim 2, said vector reading in the
correct phase for the expression of said inserted sequence
and having an active promoter sequence upstream thereof.

36. The cassette vector of claim 35, wherein
the cassette vector is produced from plasmid pCV.

37. The cassette vector of claim 35, wherein
the vector also contains one or more expressible marker
genes.

38. The cassette vector of claim 35, wherein
the inserted DNA sequence contains at least one ORF of the
HCV genome from any strain.

39. The cassette vector of claim 35, wherein
the promoter is a bacterial promoter.

40. A composition comprising a polypeptide of
claim 23 suspended in a suitable amount of a
pharmaceutically acceptable diluent or excipient.

41. A method for treating hepatitis C viral
infection comprising the administration to an animal in
need thereof of a clinically effective amount of the
composition of claim 40.

42. A composition comprising a nucleic acid
molecule of claim 1 suspended in a suitable amount of a
pharmaceutically acceptable diluent or excipient.

43. A method for treating hepatitis C viral
infection comprising the administration to an animal in
need thereof of a clinically effective amount of the
composition of claim 42.

FIGS. 4A-4F

[Nucleotide sequence of the infectious clone H77C,
nucleotides 1-9599, printed 50 nucleotides per line. The
sequence is not legible in this copy.]

FIGS. 4G-4H

[Amino acid sequence encoded by the infectious clone H77C,
residues 1-3011, printed 50 residues per line. The
sequence is not legible in this copy.]

[Figure: Genome of strain HC-J4 of hepatitis C virus,
showing the 5' UTR (341 nt), the ORF (9,030 nt) and the
3' UTR.]

[Nucleotide sequence listing, nucleotides 1-1900, printed
50 nucleotides per line. The sequence is not legible in
this copy.]
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'lUJ riO CX3Gr GTCCCmJjr ATAGCIGQQG GGftGAAIGAG ACft GftOSI Gft. 1950 

IQCIOCICAA CAACAOaOSr COOaCACAftG QCAACIQSrr (JULjL' lGilftC R. 2000 

IGGftlGAASA GTACIGSSTT CACTAAGftOS IGOaCSOSIC CULUJ IUIAA. 2050 

CAIC333QQ3 GTOOCTAADC QCAC3CTIGAT CIQQOCrftOS GAL'iUL'i'iUJ 2100 

GGAftGCAOCX: OGAGaCTftCT TACACfiAAAT QIG3CTJGC3G ULUJ iUJl'IG 2150 

ACACCIMSr QCTAGTC^ CTftOCXMaC AQ3CmQ3C ACIWJGGC'IG 2200 

cAcicicAKT TrrrocATCr TiRAasnaG GanuimuiG QQQaoacsiQS 2250 

AQC300r:r CAA3G0CX3C3^ TQCS^ATI^ 2300 

TIGMGACA GOGaaaaSIC AGAACICftGC UUUL'iUL'iLC I GICPO tfVC 2350 

AGAGIQQCAG AEACIGOOCT GiULTi'iCAC CAODCUmS GCT mTOC A 2400 

CIUJITILA T OCATCIOCAT CAGAACAIOS T33ADC3IGCA AEAUJiUilAC 2450 

GJEGIAQQCir CAQOGmGr L'iUL'i ' l ' lU ZA A ICAAATO3G AGIRCAICC r 2500 

Ui ' lULTlTiC CriCICCTOS CAGACX303CX3 OSIGIGIQCX: TSCTIGiaGA 2550 

TCAIQCIQZr GAnS^QCXTAG GCIGAQ3CX3G CCITAGAGAA CTIGSIGSIC 2600 

CiCAAIGaQG Oji U LUIGG C CQ3AQCX3C3^ QGTCAmCCTCr ULTriLTlUi' 2650 

GTICnCIGC QCX:Q0CTQ3r ACATTAAQOG CAQQC'IQGCT CCiaQQaaQG 2700 

CGTATGCnr TTAIOSCXnA TOCIQCIOrr ACIQQOSITA 2750 

CJCACCAOGAG CnAOXXnT GGADCX33GAG AIQ3ZD 3CAT OSIGOQQGQG 2800 

iGcxssncrr GrAG3IxnQ3 TATICTTGAC CTIGICACCA TACIACAAAG 2850 

UUi ' i'iL ' lC AC TAGXTCATA TOGIGGITAC AATAL'i'i'iAT CACJCAGAQCX: 2900 

GAGGGGCACA TQCAAGIGIG GGIOXCCXX: CTCAACGTIC QGG 2AGGOCX5 2950 

GGAIQCXAIC ATXXTICCICA CX3IGTG33GT TCA.TCCAGAG TTAAl'lTi'lU 3000 

ACATCA.GCAA ACICCIGCTC GOCAIACTOS G3333ZrCAT QGIQCICO^ 3050 

GTD33CATAA CGAGAGIGOC GIAL'l'iUGIG Q3D3ZECAAG QOCICAnCG 3100 

TOCAIGCATC TTAGIOCXSAA AAGIO30CX3G GGGICAITAT GTOCA AAIGG 3150 

ICnCAIGAA GCTOOGOGCXS CIGACAQCTA CXSTAOSITIA TAAGCa^TCIT 3200 

ACCECACIGC: OaGACIQGGC 0CAQaCaG3C CTA03AGACC TiULgGTGGC 3250 

G33AGAQC3CX: Gl UJlL'i'l CI ' CaaOCATOGA GAOCAAQGIC ATCALl-'lU^ 3300 

GAQCAGACAC CQCTGOGIGT GGGGACATCA TCTIGGGICT AOCUJiC ia: 3350 

GCCOSAAGGG G3AA0GAGAT Al ' lTl'lU GGA CDGGZEGATA GICTOGAAQG 3400 

GCAAG3GIG3 CGACTCCTIG CGCX]CAICAC GSGCTACTOC CAACAAADQC 3450 

03Q3CGrACr IGSnGGATC ATCACIAGOC ICACAOaOaG GG ACAAGAAC 3500 

CAOGTOGAAG QOGAGGnCA AGIGGITICr AGCDQCAACAC AATCITICCr 3550 

GGCX3ACCT3C ATCAACG3CX3 TGTOCJIOSAC TGICTADCAT GQDOZIQOCT 3600 

GGAAGACCCr AGCCX33IOCA AAAQGTCCAA TCACCCAAAT GTACADCAAT 3650 

GTAGACCIQG AC^CICGTOSS CIGQ2A03D3 OCXDGCXD3G3G QGCX3CIQCAT 3700 

GACACCATGC AGCICT3GCA GCTGGGADCT TTALTiOJEC AO GAGAC ATC 3750 

CIGATGTC^T TCCG3IG332 0G3GCSO3CX3 ACAGCAQG3G AAGICTACIC 3800 
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TOocccA^ gjuiciccm ctjsaaggc: lajiaGaGfi G CTO^nacr 3850 

TTOOCCrroG OJULJAUJIU S 1UJlJt.l.-{iU iT a.U.4T. ' iGCT (JiUiUUAODC 3900 

QGQGQSias: GAftGOCJOSIG GACTIOm^ GOSITGAGIC TME GGRftA CT 3950 

Aocarocxscsr ciaoQcsicrr cacagacaac tcaacccxxx: a33ciGiaoc 4000 

QCAGACATIC CAAGiaaC3^ AICiaC3Va3C TOCTACIOaC A GCQGCAAGA 4050 

QCADGAAAGT OCXDOaCTEQaS TATOCAQOCX: AAa3GrEACAA QSIQCTOCTC 4100 

CIGAACOOGT C3CX?nG003C CAIJLT1:A G 33 Ti ' lUGm CT jCTATOTOCaA. 4150 

QQCACAaSGr ATCGACX3CIA ACATCAGAAC TOQS^AQG AOCSmaCTA 4200 

QQG3CTG3-IC CATISmCM: TOGAOdKIG GCAAbTlLLT 'lGC03Aa3G?r 4250 

GaciGITCIG GQCXSOQCCIA TGACATCAIA. jajfl^TIGAIG ASIOCTACIC 4300 

AACIGACIOS ACIAOCAICr T333CAI033 CACALJiUClG G AGCAAGOG S 4350 

j^GACG3Cro3 AQC]a033CIC GIOGIQCIDG C]CACX33CTAC AOTCT33GA 4400 

ia3griAOCX3 TGOCACAOa: CAAIATOGAG GAAA IAG30 C '1G1QCAACAA 4450 

IQGAGAGATC COCTICIATC GC3^AAQ3C3a' CCX]C3OTGAG G0C3«rCAAGG 4500 

QQOaGAGQCA ICICAnnC TGCXaiTOCA AGAAGAAA3G IGAOGAGCIC 4550 

GCXDOCAAAGC TGACAGQ^CT CXaSACIGAAC GCTOIAaCAT ATTACXDQGGS 4600 

CCITGAIGIG TCCGICATAC CGGCIATCQG AGAOCTOGIT GIDGIGQCAA 4650 

CAGACGCIUr AA-IGACQGGT TICACmXG ATnTGACIC AGIGATO3AC 4700 

IQCAATACAT GIGICACCCA GACAGTOGAC TICAGCTIGG AIDOCAOCTr 4750 

CAOCATIGAG ACX3ADGAC0G T30GC3CAAGA a3aGGIGICX3 CUJILI^C AAC 4800

OaOGAGGIAG AACIGGCAGG GGIAGGAGIG GCAICIACAG GTT^nS^ 4850

CCAQ3AGAAC GOaOCTCGGG CA3GIT33AT ICTTOGGICC TCIGIGAGIG 4900

CTATCACGOS GXTCIGCIT QGTATCAGCr CACGQGOXT GAGAOCTOGG 4950

1TA03nGGG GGCITACCm AATACAOCAG GGTIGCCOGr CIGOCAGGAC 5000

CAICIQGAGT TUIOaGAGAG CGICTICACA GGXTICACCC ACAIAGATOC 5050

OCALTICCIG TCOCAGACIA J^ACAGGCAQG AGACAACITT OCITAOCTGG 5100 

•nSQCAIATCA AQCEACAGIG TGOGOCAOGG CICAAQCICC ACCICCATOG 5150 

1GQGACCAAA TCK3C3AAGIG ICICAIAIDGG CIGAAAQCIA CACiaCACX3G 5200 

GCXAACACOC CIUL'IGTAIA. GGCIAGGAGC OGTOCAAAAT GAGGIC ATCC 5250 

TCACACACQC CATAACTAAA TACATCATOG CATOGAIGIC QQCJIG AOCTG 5300 

CSAQGIOSICA CTAQCAOITC QGIQCIG3IA GGOG GAGIO: TiaCAQCTIT 5350 

QQCJOaCATAC TQOCIGAOGA. GAGQCAGIGr GGICATIGIG GGCagGATCA 5400 

TCriGroCQG GAAQCCAGCr GlUJriCCC G ACA03GAAGT GCICTAOCAG 5450 

GAGnaSATG AGATOSAAGA GIGIGCTCA CAACnCCTT ^M^QCA 5500 

GGGAATOCAG CroOOTAGC AATICAAQCA AAAGGOQCIC GGGITGnQC 5550 

AAAOaOCXAC CAAGCAAGCB GAG3CianG CIOCCXSIGGr QGAGICCAAG 5600 

TGOCXSAOCCX: TIGAGACCIT CIQQQQGAAG CACATGI03A ATTICATCAG 5650 

CG3AATACAG TACCIAGCAG GCTEATOCAC TCIGCTTOGA AACOXXXXS^ 5700 

FIG. 14C
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TAQCMCRTr GAIG3CAnT ACAQL'i'iL'IA ICACEAGOOC OZLCAOCftOC 5750 

CAAAACACXX: TC UIUITIIA A. CmL'i'iUGQG GGKra33IQ3 CiGGOCAACT 5800 

CX3L'iCC1CCC P£3J3CJX303T CM3JITLUJI' G3300C033C ATOQCTGftG 5850 

OOQCIGTIGG CAQCRIRGQC CnG3C3AAQ3 TOLIOJIGGA C3^LTiUL*JG 5900 

GacrAIG3GG CftLXjmilAQC (DOaOQCACIC GIG3LUi'i'iA AQSICAIGAG 5950 

CX3QC3GftG3IG OCCIDCAODG AGSAOCIOGT C^ACITACIC CCIOGCmaL' 6000 

TCICTCCIQG TOOOC'iaSIC GfID333G?ID3 TaiGCX3CAGC ^J^TMJTGJST 6050 

a33CACX3IQ3 aCXXI333AGA G3QQC3CJD3IG CAGIGC3AIGA AUUUJL'iUAT 6100 

AG mriUCCT Ta3CQ333IA AOCftOGflCIC ULXJUACOCaC TRIVSIGLL'IU 6150 

AGAQOCSAGOC: 1QC3^QCAC3Grr 'iUL'lL'lL'JJAG ULTllACCaTC 6200 

ACICAACIQC TCAAGQQOT (XACXAGIQS ATEAMX3AQS ACIQCTCflAC 6250 

QOCATOCroC GQCTO3IQ3C: TAAQ33AIGT TIQ33ATIG3 A33m3CA03G 6300 

lUITGACIGA. CriCAAGACX: TGQCIDCAGT CXZAAACIOCT GULUOJUrTm 6350 

CXX33GAGr[CC Ci ' i'lLUiGiC ATQCCP^RCOZ Q3CTACAAQG GAGTCIGGCXS 6400 

GC333GAD3QC ATCAIQCAAA. CCACCIGOX AIQD3GAaCA CAGATO3CXDG 6450 

GACA3GICAA AAAOaSTIDC AIGAQGATCG TAQOaOCEAG AAQCrTGCaQC 6500 

AftCACCJEQQC A03GAACX3TT CCXDCATCAAC QCAIACADCA OaOGACXinG 6550 

CTyS^ODCia: CX:QQCX3CCCA. ACTATTCCAG GC3CX3CTAI03 Q333K3GCIG 6600 

CmAOSAGJIA caiGGAQGnr Aa30GIGIG3 OaClzOTIDCZA CTAaJPSAOG 6650 

GGCAIGACTA CTGACAACXJT AAAGT0332A -roCCAQGnC CX3GCXXXXDGA. 6700 

ATIUITCADS GAOSIOGATS GAGIQGQ3IT GZACAQCTAC QCTCXDGGCXir 6750 

QCAAAGCICr ICIACXaQGAG GAOCTCACX?! TCCN3JI033 GCICAADCAA. 6800 

T AC i' iUJiL G GSTOQCAGZT C]QCAIQOGAG CXXX3AAOOC3G MD3TMC?GT 6850 

QCITACITGC AIGTICACOS ATCCCTOOCA CATTACAGCA GAGAOGOCIA 6900 

AGCGIAOaCT G3CTAGAQ33 TdOCGCXTT CnTAGOCAG CrCAICAQCT 6950 

AQOCALSi'l Ur CT33GQCTIC TTIGAAQCSDG ACATOCACIA COCACXZAIG^l 7000 

CiaXXD3GAC QCIGACCKA TCGAOaCXIAA CCICTIGIGS CQ3CAGGAGA 7050 

T3aaa3GAAA CMCACICX3C GIQGAGICAG AGAATAAQST AGTAATICIG 7100 

GACICirrOS AAOOOCnCA OQCXSGAQCSCSS CaiGAGAOOS AGATAIODGfr 7150 

a30330C33AG ATCCIGCXSAA AATCCAOGAA GnOOCTTCA. aJJi'iUJ-LA 7200 

TAIG33CADG COCGGB^CIAC AATCCIOCAC TGCIAGAGIC CIGGSAQGAC 7250 

CCX33ACIACG TCCCICOGGT TQCCCKTIOZ CACCEACCAA 7300 

GGCICCICCA ATADCAOTIC CACXXIAGAAA G.A03.Aa33IT (JICCTGACAG 7350 

AATCCAAIGT GICnCIQCX: TIGC3CX3GAGC TCODCACrAA GAOnTCGCT 7400 

AQCICaSSAT OCTOQOCnjr TCATAaOQ3C AOSaOGftOGG CGCnOCIG?^ 7450 

CCIQ3CCIOC GACGAC3G3IG ACAAAQ3AIC 03ACX?nGAG -lOJCACICCr 7500 

QC12t.T3QCmr CmGAAG33 aAG2GC33333 AD3CXDGAICT CAQ0GAC03G 7550 

TX.'x'l(JJlL'rA CGCJK^^CTS^ GS-AGSZCAGT G.AG3AIGICG TCIGCIQCIC 7600 

FIG. 14D
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AAIGTCCIAT ACGIQGAC?^ GOaUUL'ilaAT CA03CCATOC QCIG333AQG 7650 
AAAGEAftGCr GOOCaTCAAC OOGTIGAGCA. AL'iLTi'iULT GaSICAOCS^ 7700 
AACA3G3ICr ACXSOCaCAAC ATCOaQCAOC: QCAAOXia:: Q32ftGAAGAA 7750 
qgiCA ULTlT GACAGATIQC AAGIDCIOGA. IGAICATE?^ OSaSAQSiaC 7800 
1CAA03AGAT GAAG3a3AAG GULJiCCAC3^ TIMQ3:3AA QL'i'iUimCi' 7850 
AIAGAGGAQG CCIGCAAGCT GAOOOCTOCA CKt'lTJSXCA AM^AfiTT 7900 
1QQCTA3G3G GCAAA03ACX3 TOD3GAAQCT AI0C3^GC3^ QJJJi'l^AROC 7950 
ACAIOOQCIC 0C?IGiaC3GAG GALTiUL'IOS AAGAC30X3A AAGAGGMTT 8000 
GACACCAIXA TCAIQ3C3W^ AAfSIGAOCJCT TiUiGCXSIOC AAOCAG ftGAA. 8050 
QGGAQSXGC AAGOCAQCTC QUL'i'imLGT ATIOICAGAC CIGOSAGTIC 8100 
GIGI2ATOCC3A GAAGAIG3QC CTTEAOGAOS T3JiCiCCAC OCTIDCICAG 8150 
GJJIKLA1X33 QCIOriCATA OGSATTICAA lACIGOaCCA A02AGCX3Q3r 8200 
CGAGTIGCIG GIXSAATAOCr GGAAA3CAAA GAAAIGOCXTT AIGGOTTCT 8250 
CATAIGACAC CJQG L ' lUi ' i ' 1 ' 1 ' GACICAA093 TCACIGAGAG TGACAllOSr 8300 
GTIGAOGAGr CftATITACJCA AIGITGIGAC TTOOaaaDOG AOQOCAGRCA 8350 
G3CCATAAQG TCQCHCACAG AQOC33CTnA CAiaSGOaGT COXTGACIA 8400 
ACICAAAAG3 G2AGAACIQC OZTTATOSDC G3IQaCX3CX3C AACTGGCGIG 8450 
CIGACGACTA GCTG033IAA TADCXICCACA TGTTACriGA AQ3CCACK3C 8500 
AGCCIUICGA GCTOCAAAGC TDCIAGSACIG CACGAIQCIC Gr3^A03GAG 8550 
ACXiACCnur CGirAlUIGr GAAAGOGOGG GAAOXAQGA GGAIGGQOOG 8600 
GCCCTACGAG arnCACX33A GGCTATCACT AGJiATICOG CODOCXDOOGG 8650 
G3ATCCX3GCC CAACTAGAAT AaSAOCTSSA GCIGATAACA TCAIUTIDCT 8700 
CCAATGIGIC Ai3ia3C33CAC GATXATCIG G2AAAAG33r ATACTACXTIC 8750 
ACCCGIGACC CX3OZA0CXX: CXTTIGCAOSG GTimJCGSS AGACAGZCAG 8800 
ACACACTCCA ATCAACICIT G3CIAD3CAA TATCAICATG TATGGQOCJCA 8850 

CCCIA'roaGC AAGGAIGATT CIGAIGACIC ALTilllLlU CALLLL'riL'rA 8900
GCICAAGAGC AACTIGAAAA AGCXCTGGAT TGICAGATCT AOGQaQCTIG 8950
CTACICCATT CaOCCACnG AOCIACXnCA GATCATIGAA CXSACICTATG 9000
GIUnAOCXaC ATITACACIC CACAGITACr CICCAOSIGA GAICAATAGG 9050
GIG3CTICAT QCCICAQ3AA ACTIGGGCTA CXIIACCCTIGC GAACCTQGAG 9100
ACATCQQGCX: AGAACIGIDT GCGCTAAQTr ACIGTCXXZAG QQQSQGAQCSS 9150 
CCGGCACriG TG32AGATAC CiUl'l'iA ACT G33CAGIAAG CaOCAAQCTT 9200 
AAACICACIC GAAIGCOGS^ COGGiaCTAG CIQCSACnGr CIGSJlUb'rr 9250 
CGTCQCIGSr TACAQOaGSG GAGACATATA TCACAGCXIEG TCTOJEQaOC 9300 
GACXXCQCIG GTTICCXnTS TQCTTACIOC TAL'iTiUiUi' AGGSCTAQGC 9350 
ATTTACCIGC TCCCCAACCG AIGAAD3333 AOTTAACCAC TOCAGGOCTT 9400 

AAGCCATTTC ciumTiii ■ ■Ti'iTiTriT TrnrmTT TcrmrnT 9450 

' I 'i -!. ' -T ' L C^ -TT-cricnT TITIOCTITC Ti ' l ' l'lL CCTr CTTTAATGGI' 9500 

FIG. 14E
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10 20 30 40 50 

1234567890 1234567890 1234567890 1234567890 1234567890

Q3CTXAICr laGOOCiaGr CACOaaaQC IGIGAAAQST aDSIGAOOOS 9550 
CftlGACIQCA GAGAGIQCIG MSOOacrT CiClOZAGAT CAIGT 9595 
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10 20 30 40 50 

1234567890 1234567890 1234567890 1234567890 1234567890

MSINEECP3RK T3^RNINRFPQ UJKFEG33QX VGGVXLLERR GEKL37RMIR 50 

KASERSQEW3 RRQPIEKZaRR EEGR»©QPS ^^JWELYGNEG DSrJAGWLLSP 100 

I^SSRPSWGPT DERERSRND3 KVIDELTaCS" AEOCmLV GAED33AMIA 150 

I^^HSVRVLED GVNMOCP GCSFSIFLLA LLSCLTTPAS AYETRNVSGI 200 

YH7INDCSNS SIVYEZ^m/I MBirPaCVPCV QB3SISSEOJV AUT PILAft RN 250 

ASVPimRR HmiiVGIAA FCSAMYVGDL 03SIELV93L PIFSERRHET 300 

VQDCNCSIYP aiVSC3iPMftW nWMNWSPIT ALWSQSLLRI PQ^ WEl^gtfS 350 

/flWGVLAGLA YYSMVCaWftK VLDJMJ:FM5 VDGEIHTIGR VMSiriSGET 400 

SLFSSS^SQK IQLVNINGSW HINRIALN3Sr DSDSTDGETAA LraHKENSS 450 

GCEERMASCR PHWET^QOWG PTIYIKENSS D;??FYaMKA EREOSWPAS 500 

QVOGEVYCFT PSPWVGITD RSGVPIYSW3 ENEHIM^IIN NIRPPQGMrlF 550 

GCIV^MNSIGF IKTOOGPFOJ IGSVCa^RTLI CPIDCFRKHP EATmOGSG 600 

mUTPPCLVD YPYRIMCPC lUSFSIFKVR MXVQC3VEHRL NMOWISCE 650 

RCNLEIEERS ELSELLL^IT BWQILECAFT TLEALSIGLI HEHSPJIVWQ 700 

YLVfGVGSAFV SFAIKWEXIL LLFLXIATftR VCAdXMmL lAQAEAALEN 750 

LWLNAASVA GAH3ILSFLV FFCAftWYIKG RIAP3AM£AF YGVWELLLLL 800 

lALPPRAW. DREMAAS033 AVLVGLVFLT LSPYYKVELT PLIVMJ3iTi: 850 

TRAEAHM3VW VPPLIJVRGSR DAIULTCAV HPELIFDITK IllAELTSELM 900 

VLQAGITRVP YFVRAQGLIR ACMLVRKWAG OKVC^VFMK IXSALTGIWY 950 

NHLTELRDWA HAGLRDLAVA VEPWFSAME TKVTIWSADT AA03DIID3L 1000 

IVSAKRC3CEI ELGPADSLE33 QOWRLLAPIT AYSQ3TRGVL GZdTSLTm 1050 

DE^NQVEGEVQ WSIATQSEL AldNSVaATT VYH3AGSKTL AS^KGPZTO 1100 

YINVnLDLVG WQAPFGARSM TPCSCX3SSDL YLVIPHATVI EVRPRGDSRG 1150 

SLLSPRIVSY IJ<GSSC33PLi CP92iWGVF RAAVCTRGVA KAVDFIPVES 1200 

^3E^MRSR7F TENSTPPAV? QTFQJM^LHh PIG SGKSIK V PAA5i3^AQGXK 1250 

VLVLNPSVAA TDGRSAiMSK AHSIDENIRr GVRlTi'iUGS TTXSTXGKFL 1300 

ADOaCSQCSAY DIIICEECHS TDSTITLGIG TVLDSAETAG ABLWIAIftT 1350 

PFGSVIVPHP NIEEIGLSNN GEIPFYGKAI PIEAIKQC3?H UPCHSKKKC 1400 

DELAAKLTGL GLNAVAiTYBG LDvTSVIPPIG LVVWAmAL MIGPIGCEDS 1450 

VIDCNICTIQ T^7I:FSLDF^F TIEITIVPQD AVSRSS^BSl TGR^SCTYR 1500 

FVTPGERPSG MFDSSVLCEC YDAGCAWXEL TPAETSVRLR AXINTPCaJEV 1550 

CQEHLEFWES VFIGLTHlEA HE1OT5KQAG ENFPYLVKXQ A3VCARAQAP 1600 

PPSWD3tM<C LIRLKPIIK3 PTPLLYPLGA V^m/ILTHP ITECmiACMS 1650 

ADLEWI^ VLVG3VLAAL AAYCLTIGSV VTVGRIILSG KPAWPCf?EV 1700 

LYQEFDEMEE CASQI.PYIEQ GM3IAEQFKQ KADGLLQIAT KQAEAAAPW 1750 

ESKWRALETF WAKHMaNFTS GIQYLAGLST I^PQ^AIASL MAFIASITSP 1800 

LTigNIIXfN HJGGW^ZAAQL APPSAASAFV GAGIAGAAVG SIGLiSKVLVD 1850 

ILAGYGAG\/A GALVAEKVl-lS GEVPSTEDLV NLLPAJXSPG ALWGWCAA 1900 
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10 20 30 40 50 

1234567890 1234567890 1234567890 1234567890 1234567890

HJREHVGPGE GKVQ&WNRLI AEASRC2^HVS PUKVEESDA. AftFTIQUSS 1950 

LTTIQLLKPL HgWINEECSr PCSGSWLSEV VCWICIVLflD FKrWLQSEOIi 2000 

PRLEG7EFLS 03ECYKGVWR GUiLMJi'iUP OSAQIAGHVK NSa^IVGER 2050 

ICSNIWHSTF PINRYTIGPC TPSPAHOYSR AUWE^^TAAEEY VEVTFWGDFH 2100 

Wia^riENV KCPC33VPAPE FFIEVDGVHL HRYAPACKEL LRECVTFgVG 2150 

IW3YLVG9QL ECEEEEDTIV LTSMLIDPSH ITAEIAKRRL AKGSPPSLAS 2200 

SSASQLSAPS liCMCITHHD STOMXJERN LLWRQEMSaT xxiW£iii!i«V 2250 

VnXSEEELH AHXERETSV AAEZLEKSRK FPSflLPUCVR ECKNEELIES 2300 

WKDETOVPJV VKSCEEiPPIK KPPIPPPFEK KrVVLlEaJV SSaiAEU fflK 2350 

TFGSSGSSAV DSCSEMMjED IASEXDGCECS EWESYSSMPP Lr t Tf - H aUirajb 2400 

SDGS(«SIVSE E3^SEIWaCS MSXTWIGALI TPCAAEESECL PINELSNSLL 2450 

RHHNMiraTr SRSASLE^QKK VTETS^LOTJD mYRDTIKEM KT^KASIV^ 2500 

LLSIEEACKL TPFHSAKSKF GYGAKEWRNL SSRAVNHIPS VWEESLLEnCE 2550 

TP UJi ' l'lM AK SEVPCVQPEK GGRKPARLW FPE3Li3VKVCE KMMIXEWVST 2600 

LFQAVM3SSY GE^SEKQFV EFLVNIWECSK KCHCFSYDT RCFDSIVTES 2650 

DIEVEESIYQ CCELAEEftRQ AIRSLTERLY IQGELINSKG aaD3i3?RCE^ 2700 

SCSVLTISaa^ TLICXIK^ ACRAMOQDC 1MLVN3XLV VICESA(3IQE 2750 

D?^AALRAFIE AMTFaTSAPPG DPPQPEinXE LTISCSSNVS VPHCf^SCSKRJ 2800" 

YY UlKDP i'i' P lARAN/^EIAR HTPINSWLJ3SI HMS^PTim RMEXM THEFS 2850 

ILiLAQEQLEK AUX^QIYG?^ YSIEPLIXiPQ IIERLH3LSA FTLHS^SPGE 2900 

INRVT^SCLRK liSTPPLRIVR HRARSVPAKL LSQQCSRAMX: GFYLFWrffiVR 2950 

IKLKLTPIPA ASGLDLSQ'AE' V7^SQ3DIY HSLSRARPFW FPLiZLLLLSV 3000 
GVGIYLLENR 
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#2. Strategy for constructing chimeric clone of HCV (pH77CV-J4) 
which contains the nonstructural region of strain H77 and the 
structural region of strain HC-J4 



[Diagram. pCV-H77C: 5' UTR; C+E1+E2+p7 (delete); NS2+NS3+NS4A+NS4B+NS5A+NS5B; 3' UTR.
Restriction sites labelled: Age I (156), Cla I (710), Xho I (2282), Nde I* (2763), Eco47 III (blunt end; 2851), Hind III (7862), Nde I (9160), Afl II (9403).
PCR products A, B, C and D; fragments B and C joined by fusion PCR.
Primers labelled: H2751S(ClaI/NdeI), H2870R, H7851S, H9173R(P-M), H9140S(P-M), H9417R, J4-2271S, J4-2776R(NdeI).
pCV-J4L6S: Nde I* (2763).]



1. Fragments A, B, C and D: PCR amplification from pCV-H77C or pCV-J4L6S

• Fragment A: additional Cla I site, artificial Nde I site introduced by a single mutation
(C→T at nt 2765 of H77C) and authentic Eco47 III site

• Fragments B and C: Nde I site eliminated by a single mutation within the primers
(C→T at nt 9158 of H77C), and fusion PCR with both fragments
• Fragment D: artificial Nde I site introduced by 2 point mutations within the primer
(T→A at nt 2762 and C→T at nt 2765 of J4L6S)

2. TA cloning of PCR products 

3. Sequence analysis 

4. Cloning of Fragment A (Cla I-Eco47 III) and Fragment B/C (Hind III-Afl II) with correct
sequence into pCV-H77C

5. Complete sequence analysis of new cassette vector (pH77CV), into which the structural
regions of different genotypes can be inserted

6. Cloning of Fragment Age I-Xho I (cut out from pCV-J4L6S) and Fragment D (Xho I-Nde I)
with correct sequence into the new cassette vector; 3-piece ligation

7. Complete sequence analysis of 1a+1b chimera (pH77CV-J4)

8. In vitro transcription (within 24 hours of inoculation) 

9. Percutaneous intrahepatic transfection into chimpanzee

FIG. 15 
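The point mutations listed in step 1 can be checked with a few lines of code. This is a minimal sketch, not part of the original figure. Nde I recognizes CATATG. The wild-type hexamers below are hypothetical: they are back-derived from that recognition sequence and the stated substitutions, assuming the artificial site occupies nt 2761-2766 and the authentic H77C site occupies nt 9158-9163. The helper name `mutate` is illustrative only.

```python
NDE_I = "CATATG"  # Nde I recognition sequence


def mutate(seq, first_pos, changes):
    """Apply point mutations to seq.

    first_pos is the genome coordinate of seq[0] (1-based numbering);
    changes maps a genome position to an (old_base, new_base) pair.
    """
    bases = list(seq)
    for pos, (old, new) in changes.items():
        i = pos - first_pos
        if not 0 <= i < len(bases):
            raise ValueError(f"position {pos} lies outside the sequence")
        if bases[i] != old:
            raise ValueError(f"expected {old} at nt {pos}, found {bases[i]}")
        bases[i] = new
    return "".join(bases)


# ASSUMED wild-type hexamers at nt 2761-2766 (back-derived, not read from
# the sequence listings).
h77c_wt = "CATACG"
j4_wt = "CTTACG"

# Fragment A: a single mutation creates the artificial Nde I site in H77C.
h77c_mut = mutate(h77c_wt, 2761, {2765: ("C", "T")})

# Fragment D: two point mutations create the same site in J4L6S.
j4_mut = mutate(j4_wt, 2761, {2762: ("T", "A"), 2765: ("C", "T")})

# Fragments B and C: a single mutation destroys the authentic Nde I site,
# ASSUMED to span nt 9158-9163 of H77C.
h77c_3p_mut = mutate(NDE_I, 9158, {9158: ("C", "T")})

print(h77c_mut, j4_mut, h77c_3p_mut)
```

With these assumed starting sequences, both 5' junction hexamers become CATATG, so the structural region can be exchanged at a shared Nde I site, while the downstream hexamer becomes TATATG and is no longer cut.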
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QCCAGODCXr TGAia3333C GACACICCAC CA.1GAATCAC T CCGJ iUilA 50 

QGAACIRCIG TCTICAD3CA GAAAQ33ICT AGOCAIOSDS TISCTmSAG 100 

' iUiUJIO: ? ^ (DCroCAGGfiC OUULUJiOX' G33AGAQCrA TfiGI OJi L'i G ISO 

O33AAC0QSr GAGEWZA003 GAATiaXPG GAOGAOC3333 ' iUL'lTlLTm 200 

GA.TCAA0003 CICAAIGCCT aSfiGATTIOS ULUiGCJGCJGC QC3GftGACIGC 250 

TAGOOCaRGIA GflGnOOSIC GOSAAJ^OaOC TIGiaGflRCr GOnKSMaOG 300 

(JlULTiUULA (ilUUG C D 333 AOSICIOSIA GAOJJiUCAC CAlGAQCaCE 350 

A ftliUL ' iA AAC CICAAAGAAA AADCJ^AACCT AACAOZAAaZ C30CI33OTCA 400 

G3A03IC?VAG 'i'lCCG33333 GIG3ICAGAT UaTiLUiULSA Gi'i'JjAOJiUi' 450 

1G0CXXX3CAG aaOOOCCAQG TIG33IGIQC G330GACI2^ GASaJLTiUJ 500 

GAGUGGjimZ AACXnOSIGS AM33D3ACAA LUllALLLUJAA AGGCiLUJUS 550 

;iDC3CX3AQ33C AL*JJLL.'iU33 CICAGOOOaG GimJL'i'iULi COOCICERIG 600 

aCAAIGAQ33 OJiUJULj i UG GCAGGAIQ3C 'iOJiGiCAOC 003033010: 650 

OaQO033\GIT GOOGOOOCAC GGAIDO0O333 CJmJJDJJZ GYM^CUVGOG 700 

TAAQ3ICAIC GATA LU- ' i'lA CAlULUJL'i'i' OSOOSMCIC AiULUUEACA 750 

'ITCGSl'ICGT OSQOSOCCCC CrAQ333303 CIG0C35O33C C'i'iUJLJACAC 800 

GGriGICCG 3 3 TICIQGAQGA OQ30GIGAAC TATOCIAACPG O^AACnOOC 850 

CQGnOCICr 'I'ICTC IA ICT TOOICTIQSC: TCIOOIGIOC TGmGAOCA 900 

TCOCAGCnC OSOmiGAA GTGGGCAADG 'iG'10033GAT AmDCAIGIC 950 

ACS^AOSACr GCIOCAACIC AAQCATIGIG T?vIGAG3CAG 03GAa3IGAT 1000 

CAIQCATACT OeXJGJGIGOS T3L U, ' 1GIGT TCAGGAGGGT AACAGCTCOC 1050 

Ui'iUL'i ai iG T AQ03CICACr 00ZAO3CIO3 CG3aC?i3GAA T3C]C:AG33IC 1100 

CCCACIAOCSA CAAIAQGAD3 CCAOJDOG^C TIGOiaJETG GGAOJUL'iUC 1150 

' ll ' lL ' iUL ' iC C OdATCTAOS T3Q33GA3Cr CIG033ATCr AiTi'iUL'iULi 1200 

TCIOOCT^GOr GnCAOOnC TOGOCIOSOC G3CAIGfiGAC AGIOZAGSfiC 1250 

TaCAAL'i G C T CA AlUimOl' COaOCAIGIA TCAQGICftGC ULmUJL'i'iG 1300 

. GGMKCGftTO AIGAACIQ3T CAOOIACAAC ^OXCIPdG GTGTCOCPCT 1350 

Ta CiC CG G AT OOCACAAOZr GIOJCGGACA 'iUJiOXOGG oaCOCACIGG 1400 

G3AGIOOIG3 a3330CITGC CIAL'im'iLC AIGGTCAOXA ACiOXCIS^ 1450 

G Ji ' lLlG ftTT GiaSOSOTAC 'iUlTiULLUJi CGi'iGAC333 GPGROJOftCA 1500 

03Aa333GAG GJ1U GCC G33 CACACCADOr Q0333ITCAC (J iLU-' i 'lTiC 1550 

TCAICD3G33 03ICICAGAA AATOCAGCIT GIGAATAOZA. Aa33CAGCIG 1600 

OOACAICAAC AQGACnxrx: TAAATIOTA TG?^'iUJ:'iC CAAACl G33r 1650 

' iUri 'l U L UJ C OCIGrmAC GCACACAJCT TCAACT^GIC 033GnGQa03 1700 

(S^OCGCATOG CCAGOItXnS CXDOCATIGAC ' lUJ l'lUXCC J^3333I^ro 1750 

CXXX::AICADC TATACTAAQC CTAACAGCTC GGAICAGAG3 OOTEATiaCT 1800 
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G3ZPnTPa32 QOCroSAOaG ' lUIGG .' Ki T CG TAOOOQaSIC GCM33B3E3T 1850 

GSIOCAGIGT A I ' IUITIUA C eXTAAGCXXT Ui'iUlUi:?iUG G3ftaaO0GA> 1900 

TO3nCCX33r U l LLL'-L A OGT ATSOl'iGQGG GGAGAAIGAG ACftGAOGIGA. 1950 

1GCICCICAA CAACA030GT OQQDCACSWS GCAACIOGflT CmTIGIACA. 2000 

G mJiUJUlT CACIAfiGftGG TG333Aa3IC GOOGGiUlIftA 2050 

CA L LUjGG33G G ' iUJJI^ft AOC GCAUJi'iUAT CIGOOOCACG GRCTQCTIGC 2100 

QSAAQcs^GQC osrarrRcr tacacaaaat Giaacroaos GoocrosriG 2150 

ACfiLXJI fiG Sr GOCIMflRGA CI3yDC33a3AC AQ3CmQ3C ACIZmXTC 2200 

CA L ' iUlCA AT TrriULAlC T TEA^ ^l ' liA G GAIGIMGIG GG33aa3IQ3 2250 

AOZACS£33Zr CAAiaD03CA TaCSATIGGA. CICGA03PGA GCOJiUIAAC 2300 

TIQ3AG3ACA GGGATAQSIC AGAACIX^GC CDSZIGCriQC TGICD^CAAC 2350 

i!^GAGIG3ZAG AmJ iU-CC T (JiUL'i'i'iCAC CALUJilACCXS QCITMGCA 2400 

CiUJi ' i 'i l AT OCmC'IGC ft T CAGAACAiaS TGSAOJiULA AiaOJi m!ftC 2450 

GL?l GTAGGGr CAQ Ub ' ri ' iUi ' C'iULTi'iUA A ggOgOC T 2500 

Gl ' lULTl ' l ' lL CriCICCIQS OJiUiGIGlC 'iUJi'iUiUA 2550 

TGATOTIGrr GATAGGOC^^ OnGAG3303 CCITAGAGAA CnOCTGGIC 2600 

CICAAIGOSG OG?IOCXnQ3C CQ3AG0GCAT OJCOTCICr CCnTCnGT 2650 

GTICnCIQC GCOGCCIGJr ACATEW^03G CAGSTIGaCT CCia333Q33 2700 

GCTAIGCnr TTAIGGCXJm 'iUJLLUJiUC ■ILL.'iUL'iCLT i^rnmCTm 2750 

CCACCACGAG CATAIGCACT GGACACX3GftG GiaSOCXSQGT O3IGIQC303G 2800 

OJl ' lUi'lUlT G I LUiJl'l V A TG303CrGAC 'iL'iGTOXCA TATTACAAGC 2850 

GCrrATAICAG CIGGiaiMG T33It33CnC AGTAl'iTiUi' GADCAGAGm 2900 

GAA033CAAC TGCADGIGIG GGTICCOGO:: CTCAAOGICC 0333333333 2950 

CGATOCCGIC ATCITACICA. TGIGIGIAGT ACAD3DGACC L'i UJJATn G 3000 

ACAICACCAA ACI7CICCIG GOCA LLUl ' lLG GADQOCTITG G Al'iUi'iCAA. 3050 

accAGrnoc TiAAAGiaoc c mjriajiG cgcxjitcaag Qa.'i' iL'iD.G 3ioo 

Gft LLUTGC GCG CTAOOGDSoA. i:^GAIAG303G AGGICATTAC GIGCAAMQG 3150 

CCAICAICAA Gim333GG3 CITACIG3CA CCEAIGIGIA TAADCAICTC 3200 

AOO LUiUriC GAGACIG33C QCACAAD332 CIGCGPGSJC UGL U-GiaaJ 3250 

TGIGGAACCA GlLGiUi ' iC T G^DGAAIGGA. GPOC3^?GCIC AIC AlGIU^ 3300 

033CAGAIAC OSCXmSIGC GGIGACA3CA TGAAOJJ-Tr O-UJJiUiL'i' 3350 

QQCCX7EAG3G GCCAGGftGAT A L ' lUL ' i ' iUGG CCWSSOGAOG GAA3G3ICIC 3400 
CAAG333IG3 A OJi ' lU Jl G G CG3CCATCAC G333rACm: 3450 
GAQSLL-'iCCT AGSGIGIMA ATCPCCN3CC TGACIGSOCG GGACAAAAAC 3500 
CAAGIGC3AG3 GIGAQGIOCA GAID3IGTCA PCIGCrAOX AAAOCITDCr 3550 
G3CAACGIG2 AICAATGSSS TATGnGGAC l UiCIAOCAC GUG3303GAA 3600 
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OGAQGACCAT OacmCACOC AAQQGIOCIG TCATOCAGAT GISOSVCJCAAT 3650 

GTOGAOCAAG A UL ' i'iUiUJ G C1G3C033CT 0CICAi^33IT OOaXTECKIT 3700 

GftCAO CCTCT ACCIGOQQCT GCI033ACET TTAOCIOGflC AOGPOaCACB 3750 

OCXSOGICAT 'iOXGIGOaC 03000^0310 ATfiGCAGOGO TS^GOOIQCIT 3800 

■j^^^^-Y-rt-rry^ (Xm ' i ' iUL ' m CnGAAAQOC 'iaJI033333 (JiULUL'iUiT 3850 

GIQaCCaSOG OGACACaOOO TGOOOCIMT GAOOOOOSOO GiGiGCAOOC 3900 

GIQOAGTOOC TS^AAQOOOIG GAL'i'i'imCC CIGIGOAGAA CCTAGOGACA 3950 

ACCATCAGAT QCaa3GIGIT CAOSGACAAC TCCICIOOAC GAGCAGflOX 4000 

CCAGAQOnC CAaOiaOOOC AOCIGCaiQC •KDOAOOOOC AQG03EAAGA 4050 

QCACCAAGCSr CCXJCJX1G33 TA03CAG03C Oji Uncm G 4100 

cicAAOOQcr ciGiroTiac aaooctoooc: Tnaoianr acaigxgcaa 4150 

Q3CXrATO3G GTIGAIOCrA. AIAICAQOAC 03333IGAGA. ACAATEAOOA 4200 

Cia3CAGCGC CATCAOGTAC TDOACCIADO GOAAGTICUr T33JGAQaGC: 4250 

GGJIUC'ICAG GAQOIQCTIA. TGACATAAIA ATTIGIGACG AGIGOCACIC 4300 

CAOOGAIQCX: TO^iaCATCT TOOOCATOOO CACIGICCIT GACTAAOZAG 4350 

AGACIQCX300 QOQGAGACIG Gi ' lUlUL'ICG, QCACiaZIAC aXJiaOOQQC 4400 

TCQOICACTG IGIOCCAaXX: TAACATCOAG GfiG^TIQCIC IGTOCACCAC 4450 

CQGAGAGATC COCinTADG GC3^AOXTAT COQOnOOAG GIGATCAAQG 4500 

G3Q3AAGACA TCICAICTTC TG32ACICAA i^GAAGA^CTO OOAOOAGCTEC 4550 

aCGGGGAAOZ TOOICQIATT G330AICAAT GJ JJiO^OCT ACIALU.03 3 4600 

ICTTGACGIG TCTGICAICC CGAO0AG033 CGAIGITGIC GIOOIGTOGA 4650 

CXXSAIQCICr CAIGACIOX TmC03303 ACITOOACIC TGIGAITOiC 4700 

IQCAACACGT GIGICACICA GACAGIOOAT TICfGOCITG ADOCJIAaCrr 4750 

TAOCATIGAG ACAAOCAOSO TOOOOCSGOA TOnGICiaC ^^ QOACICAAC 4800

QcaoaaacAG gacigooaqo gogaagocag qoaiciaiag ottctooca 4850

CCQ330GAQC OCXXCTOOSO C AliUi'lCG fiC lUJiUJOTOZ 1CIGIGAGIG 4900

CEAIGA0333 GOCIGIGCIT QOIATCAarr CPO^CCCCCC GAGACIACAG 4950 

TEAQ33]ACG AQOCTACATC AACALXXD33 GXTiaOCT GTOOCaGGAC 5000 

CAiCriGAAT ' iTlGG3 AG33 CUlLTi ' rA OO Q33CIC ACIC j^P^ GSJGC 5050 

QOA LTl ' l ' l ' IA TCQCAGACAA. AGOAGfiGIGO OGP^SAAdTT CCTmjJi^ 5100 

TAOaSTAOCA AOCJCACXCTG T3G3CT?m3 CICA?£3aDDC TOmggaS 5150 

TtSOGACCAGA TGIQGAAGIG TTIGAian: CITAAACOC?^ OOOTOCAIGS 5200 

GCCAACACQC dOZCATACA GACIG33330 'iUriO O^ GAAGICACOC 5250 

TGACXaCADX AAICADOAAA TACATCAIGA. CAT3ZA3GIC GOOCXSAOCIG 5300 

GAOSIQOICA OOAGCAOriG QGflGZia^IT 0303300100 TOOCIOCICr 5350 

GOOCQDOrAT TG O- ' lU i C AA CAGSOIOOT 03ICATAGIG OSCAQOAJiaG 5400 

FIG. 16C
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' iL ' i'lUB XSS ATMaOCTS ACAGGGftGST ■lyjisjLtnZRE 5450 

GRGiraaiG AGftlGSftAGA ( JiUL'lL'iCA G CACTCOCT ACATOSftGCA 5500 

JVQQGMGATC ClU J Ll GAQC: AGirCAAGCA GAAGCJUUCIC OSJUCOTOC 5550 

AGAC0333IC CXX3GCAIGCA GAOSITATCA OLUJiUCIGr OCAGAOCAAC 5600 

TO3CaGAAAC HXSAiSSICrr Tia33C33AAG CACAIGIOGA ATITCAICRG 5650 

TOQ3AIAC3A TaCnG3C333 GOCIGICAAC O-'lUULUULJi' AAOaCTOCA 5700 

TiuL 'i' i c A TT (aiaacmr acaol' i gogg TCAoaGoac pciMCC Pcr 5750 

QQOCAAflaOC TCClCnCAA CAJai'iG333 0331003103 CiC mZAQirr 5800 
Q j^jjtimC GSIOOTTEA ClULUi'l ' lUi ' O03IO3IO3C Cl^OCiaaOS 5850 
CCX3XATO33 CAGC33na3A CIG332AAQG TXTCXID33A CmiLTiUJA 5900 
C33JCAIO30G a33QCJ3I03C: G33AG3IdT GEAGZATICA AGAICAI GAG 5950 
0GGIGAG3IC 0CCI0CA033 AQGAQCTOGT CAgCIGCIG OCr m^TO: ( 
' lUiLUw L'iCG AQ UULTiUm ( JIUUUIUIUj 'lUiOCDZfiQC AAJ3\tJi^AJA, ( 
(CTOSOSITC QCXmsaOGA GQ0aQC3«3IG CAAI^^ ' 
AQQCTiaXC TCCn33333A AOIMCTITC COO^^ ' 
AGAG33AIOC AQCmiXm: GICACTGOCIA TACICAGCAG O CnXZAC TGIA i 
i^OCJCAQCIOZ TOAG32GACT GCA.TCAGI03 ATAAGC T333 AGIOIAOCAC 
TCCATO^ICC OOnCCIGSZ TAA03GACAT CIOaGACTIOG AIMOCXSAOS 
laciGAQOGA. CITTAAGACX: T33CIGAAAG CCAAQCICAT GOCACAACIG 
CCI03GAnC OJlTiUiG I C CIOO^AOCXS: G33rAm533 Q33ICIOQC33 
AQGAGAOaaC ATEAIQCACA ClLUJi G OCA CIGIQ2fiG3r GAGATCACIG 
CSOOOICAA AAAa333AD3 ATGAGGAID3 TOaSTOZrAG GACXTIOZAGG 
AACAIGI03A GI033AC?3IT CCQCATIA^C G^CT ACAOCA C^^OOCTG 
TACICCXnr OCIQOGOCXSA ACTAIAAGIT LUJU L'iUiO S AQ33IGICIG 
CAGA03AATA OGIOGAGAIA AO30333iaG OOSACnOCA CTAaSTAIOS 
QCTAiGACm CIGACAATCT TAAAIOOCXDG T3ac:AGA3a: CAll ^u,OXA 
MTmCACA GAATI03A0G QGOIOCGXT ACACPGSnT QLUJ-CCCTT 
QCZAAOOGCIT OL'i03 33 3AG GAQ3IAICAT TCAGAGIPOG ACiraOGfiG 
TACCCX3GI03 OSIOSCAATr A0CTI03M OC30GAAC03G ACGMOOT 
GTIGACGICC AI03ICACIG AlLUJiCCCA TA TAACAGC A GALj^fc- VA: *,U^ 
03AGAAG3IT GOOGAGAOaG TCALUJLLTi' CrATOOCCAG CIOCia33CT 
AQOCAQCIGT (XQCTCCaTC TCICAAQOC^^ ACTTQCAOT GCAAOCAIGA 
CiCCQCIGAC GCTGAGnCA TAGAGOTIAA OJiUJiUiGG AG3ZA GGAGA 
-roSSCQOCAA CAICADZAGS GTIGAGICPG AGAACAAAGT QOI^TCTG 
GA L ' lL LT lL b AJXrXS L ' i ' iUi ' GQCAGAGGPG GAIGAGCX33G AQJiu:ia:gr 
ACCIOZAGAA AnCIGa33A AGICIOSSPG ATiaXXDOSG GCHJiUJQX- 
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1CIQ330Q0G GOOQGACIZC AAGOOGQCOC TSCTRGftGAC GIGGAAAAAG 7250 

CCIGACIACG AADCRDCIGT G3ICCA3Q3C 'iUUUUUL'rAC CAOnDCAGG 7300 

GlUXCICCr GTOOCimSC CIOSSAAAAA G03I3O3GIG GIOZICAOOS 7350 

AATCAAOCT ATCDOao: TIOaOOSfiGC: TTGOCaOCAA Afi LjlTi'lULL : 7400 

AQCICCICAA CnO033C3^ TAD33303AC AKEA03ACAA. CALLUL'iL'iUA. 7450 

QCXX33CXXXJr TCIQXTGCE gjGGOSACIC CXSAOSriGRG lOCIKncrr 7500 

CCAIGOOGOC C3nQ3fiQS33 GAGOdOOaS AlC03jftICr GRGOGRaaaS 7550 

Tc m aaa aA . cmi ooaG loasxjasAc ao33aagatc tcsigigcig 7600 

ClCS^KIGICr TOTOCTOGA. CAOrQCACr CXnCAOXDS TOasnQOOS 7650 

AAGAACAAAA ACIGOOCMT AAOOCACIGA GC3^'iUJi'i' GCiaa3aCAT 7700 

CACAATCIQS TGrrATICEAC CACnCAOGZ AliiUL'i'iQX AAAGOCAGAA. 7750 

GAAAGICACA TTTGACAGAC IQCAAUi'iL'i' QGACAQOCaT •D^CGAOGftaS 7800 

lanCAAQGA QSICAAAGCA. GOJULXJIL A A. AAGIGAAQQC TAALTiUL'lA 7850 

'ICCGTASOS AAGCnOCAG CCIGAOGCXDC OC^CATICSG CXZAA^OA 7900 

GmOQCIAT GQ33CAAAAG ACGIOCTIG OCAIOGCAGA AAGGCDGiaG 7950 

CCCACAICAA CimJEGI03 AAAGALUi'lU TQGAAGACAG TGTAA CAOCA 8000 

AIAGACACIA CCAICAIQGC CAAGAACX3?G Gi'i'i'iL'iODG TTCAGOCIGA 8050 

GAAa33333r CGHZAAGOCAG CiUJiUICAT OGIGITOOC GAOdOOGCG 8100 

1GGQCGIGIG CGAGAAGATC GCXriGEAOG AOGnGGITAG CAAQCICXXr 8150 

CIUJLOJI G A. T33GAAQCIC CIACJSGATIC CAAIACICAC CAGGACAGg 8200 

GGITGAAnC CIOGIGCAAG CXJCQGAAGTC CAAGAAGACC COGAIQGGGT 8250 

■lUlLUilA IGA. TACCCXaCIGT TTIGACIOCIA CAGICACIGA GAOaJGACATC 8300 

CGEAOaGAGG AQGCAAriTA CCAATCTIGT GACCIQGACC COGAAGOCOG 8350 

CGIOQCCAIC AAGIOXTCA CDGAGAGGGT TTS^IGTI^ Ci^LLlLTIA 8400 

OGAATICAAG G3GGGAAAAC lUJUGC'iaCC GCAGGIQOOG OGOGfiG GGGG 8450 

GIACIGACAA CIAGGIGTGG TAACAOOCIC ALTiOZEACA. ICAALOimG 8500 

QGCAGOCIGT OGAGG0GC»G QQCTOCAGGA CTGCAOGAIG CiaGTGIGIG 8550 

aCGAOSCTT A GlUJi'-LA TC TGIGAAAGIG GG33GGICCA. GGAGGAOGOG 8600 

QQGAGOCTGA. GAGCmCAC GGAGGGTAIG ACGSGGTRCr COGGODGOGC 8650 

CX33GGAQCCC GCACAAOGAG AATAGGACIT GGPGCTEftIA A CAICA IGCT 8700 

CX:iGCAACJr GICAGTGGOG CACGAOaGOG CTOGAAAGfiG QGICI3\CIAC 8750 

CnACQOGIG ACXCTACAAC CULLUiLGlG AGWGLUJCCr OGGfiGACAGC 8800 

AAGACaCACr CCAGICAATT CCIQGCnaGG CAAC ATAATC ATCm^^ 8850 

CCACACIGIG G3GGAGa=iIG AIACIG?^JGA OCCALL'i'iUiT TAGO ilV-ulv, 8900 

ATAGCX:3^GC3G AICAGCTIGA ACAQGCTGIT A^CTGIGSGA TUIAOGGAG: 8950 

CIQCTACICC ATAGAAOCAC TGGAICIACX: TCCAAICATT CAAAGACiaC 9000 

FIG. 16E
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AioaGcicftG aacArmcA ctccpcpgtt Acicrocftos tgaaaicam" 9050 

AQasiaaaoG caroaciCAG AAAftcnms gioootoct laaGfiQCTiG 9100 

GAGACACCJQS G30a3GAGC33 TOOSDSTiaG GCTICIGIDC AGAQSAOaCA 9150 

G3GCiaCIAT AIGIOQCAAG TACCICTICA ACIGSSZZO? AAGAfiCAAflG 9200 

CICAAACICA. COXXS^AIADC QOOCXSZiaaC 033Cia3ACT •Hoivjo^i -Ha 9250 

GnCAa33CT QOCIACAQOG OQSGAGACKr TMCACRa: G1GICIC2V!IG 9300 

rr^rryyrrm ClULtriUl G G TITILLCTA C 'iUL'iUL'iLU: TaMSaGflA 9350 

QQCAICEftCX: laCroOOCAA GCEAIGAAGS nmSGTAAA CAL'iU,UiJ^ 9400 

TCTiAAQacA Ti ' iLUiui ' ri ' ' i ' rrri ' i ' i ' i ' i ' i ' i - iTiTi ' i ' i ' i'r I' lTi'iLTiT r 9450 

'I ' lTi ' i'i'iui ' i ' Tccmocrr crnrma: 'iTi ui'i'i'i' iL aji'iurriAA 9500 

1QGIG3CICC KTCITMSOOZ TAGICA0Q3C TAOCIGI^A AQSIDaSIGA. 9550 

QOCXSCAIGAC T3CAGAGAGr GCHGAIPiCIG GLUiUiCiVX AGATCATOT 9599 

FIG. 16F
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10 20 30 40 50 

1234567890 1234567890 1234567890 1234567890 1234567890

MsmPKPQE^K TKFNINEWPQ DVKFPQQaQI VOSVYLLPRR GEKDGWRKER 50 

KZ^SERSQERG RRQPIEKZVPR EEESRftWAQPG IH'iAGWLtSP 100 

PGSRPSWGFT DERBRSRNLG KyiDTLTOSF KDLIGTIPDJ GAEL33AARA 150 

LAHG^/RVLED GVNYMmLP QCSFSIFLIA LLSCLTTPAS AffiVKNVSGI 200 

■VH7INDCSNS SIVYE?^AD7I MHTPQCVKV QE2NSSF5CW ALTPUAfiFN 250 

ASVPnriER HVDLIiVGmA. PCSAMYVOX 03Sim/SQL FTFSERRHBr 300 

VQDCNCSIYP a^VSGHEMfiW EMMMNWSPIT ALWSQtIKE PQAWEWR^G 350 

AHWGVIAGLA YYa^^O*«MC VU^ffiLLEAG VJJLihliHi'iUK VA£3ii'iyGbT 400 

SLFSSGAS^C IQLVNINGSW HINFTMICN DSLQTGFEAA LFC^KENSS 450 

GCPEKMASCR PIDWEAQOWG PriYIKHQSS D^RFiOHC^ EE^POSWPAS 500 

QVOGPV:£CFT PSPyWGfTID PSGVPIYSWG ENEIIM-ILLN NIRPPQGJSWF 550 

GCTAZMNSIGF TECDDGGPFCN IQ3VGNFTLI CPIICERKHP EAIYIKDGSG 600 

PWLTEECLVD YP!£RUWEKPC TUIFSIEK^/R MYVOGVEHRL ISC^OIWIRGE 650 

pCNLjEERERS ELSELLLSIT EWQrLPCAFT TLPALiSIGLI HmSNIVDVQ 700 

YLY3VGSAFV SFAIKWEYIL IIjFIJLLADAR VCACLIAMMLL IAQAEAMJEN 750 

LWLNAASVA GAH3ILSFLV FPCAAWYIKG RIAK3AA2^AF ^lEVWPLLLLL 800 

LALPPRAYT^ DTEVAASCXSG WLVGLMALT LSFfYKRYIS WZtM^CJQfYFL 850 

TF[VEAQLHVW VPELJNVRG3R DftVTT.TMZW HPILVFDITK LLLAIP3ELJW 900 

IIj3?^SLLKVP YFVRVQSLLR ICAU^KOAG OfYVgyMIK DS^TGTirvy 950 

NHLTELEDWA FMrn-PTTAVA VEPWFSHME TKLnWoADT AA03DII3SK3L 1000 

P7SAPK3QEI IU3PAEX3MV'S KOWRIXAPrr A^i^QTRGLL GCIITSLTGR 1050 

DKN!2VEGEVQ IVSTAIQrFI. ATCINGVCVJr VYH^AGIRIT ASPKGPTigyi 1100 

YINVD2DLVG WPAPQGSP.SL TPCIGGSSDL YU-JTPHMJJl PVPPHGDSRG 1150 

SLLSPRPISY LXGSSGGPLL CPA!2^AyGLF RA.-Jy'CTRGVA KAVDFIPVEN 1200 

DGTIMRSPVF TCNSSPPAVP QSPS^/AHLHA FIGSGKSIKV PAA^iS^AQGirK 1250 

VLVLNPSVZA HJ3FG?^i!MSK AHSVDENIPT GVKiTi'iUSP TIYSTiraKEL 1300 

ADQQCSQGAY DIIICCEEHS TmESIDSIG TVDDQAEIAG AHLWLAIZff 1350 

PB3SVTVSEiP NIEEVM^TT GEIPF^GKAI PLEVTKQGRH LIECHSKKKC 1400 

DELAAKLVMj GINAVKirYPG m/SVIPTSG EWWSIDAL Mi^^- ikiut'LS 1450 

VIDCNICVTQ TVDFSLDPTF TlEi'i'i'LPQD AVSKIQRPGR TQRGKPGIYR 1500 

PVAPGERPSG MEDSSVLCEC YDAGZftWYEL TP.AEITVRLR AJ^MNTPGLPV 1550 

CQEHLEEWEG VFIGLTHim HFLSQIKQ33 H-FPYLVAYQ MVCARAQAP 1600 

PPSWD3MCC LIFLKPTLHS FIimnFiKSA. VigSEVTLTHP ITPCmilCJyB 1650 

ADLEWTSTW VLVGGVLAAL AAYCL^SIGCV VTvGRIVLSG KPAIIPEREV 1700 

LYQEFDEMEE CS<;^ILPYIEJ2 CMXILAEQFKQ KZUJSLLOTAS FHAEVITPAV 1750 

QINWQKLEVF WAKHM/^NFIS GIQYIAGLST LPOSIPAIASL MMTAAVTSP 1800 

LTIGOTLLfN IDGGWVAAQL AAPGAATAFV GAGLAG?AIG SVGDGKyLVD 1850 

IIAGYQAGVA dALVT^JKH'S GEX^PSTEDLV ^^LLPAILSP3 ALWGWCAA 1900
H77CV-J4aa Sequence

10 20 30 40 50

1234567890 1234567890 1234567890 1234567890 1234567890

illSHVGEGE Gl^SJQ/mBU: AEASRGNH7S PIHYVPESEA AARVEAUSS 1950 

LTVIQIXRRL BSWISSETIT PCSGSWLRDI WDW ICEVI ^ EEOWLKRKm 2000 

B2LEGIPFVS OSAETiaiVK NSIMRI VCTR 2050 

TCENM^SGIF PINAYTIGPC TELPAENXKF ADWRVSAEEY VEIRKVGCFH 2100 

YVSGMITENL KCE03IESEE FPIEUDGVRL HRFAPPCECEL LREEVSFRyG 2150 

IHEXPVGS2L BZEEEETWAV LTSMLTDPSH riAEAAGRRL AEGSPPSMAS 2200 

SSASQISAPS LKATC33KNHD SH»ELIEftN LIJWRQEM3GN nRWESENKV 2250 

VUDSFDELV AEEIERETSV PAEILRKSPR EARALPVWAR PDVaOPELVET 2300 

WKKPDiEPPV VH33SLPPER SPPVPPERKK RIWLTESIL glftlAELAIK 2350 

SR3SSSISGI TOWnTSSE PAPSOCPEDS EWESYSSMPP LEGEPCS^PDL 2400 

SDGSWSIVSS GADIECWVO: SMSYSWIGAL VTPCAAEEGK UPINALSNSL 2450 

Li^HHNLWSr TSRSAOQRQK KVIFCeUSVL DSHYQEVLE^E VKAAASKVKA 2500 

NLLSVEEACS LTPPHSAKSK PGYGAKD7RC HAPKAVAHIN SVWKDLLEDS 2550 

VrPIDITIMA KNETHirVQPE KGC3?KPARLI VFPDD3VRVC EKMALTOVS 2600 

KLPLAVM3SS YGB2YSK32R VEFLVQRWKS KKTEMZFSYD IPCFDSIVrE 2650 

SDIRIEE2!^ QGa3LDP3AR VAIKSLTEFL YVGGPLTNSR GETCGYFPCR 2700 



ASGVLTISCG lOTJICYIKAR AACRAAGU2D CIMLVCGDDL WICESAGVQ 2750 

EDAASLRAFT EM^TRYSAPP GDPPQPEYDL ELITSCSSN^/ SVAHD3AGKP. 2800 

VYYLTRDFTT PLAPAAWETA RHTPVNSWLG NlllXIr AFII>J APHimiHFF 2850 

SVLIARDQLE QALNCEIYGA CYSIEPLDLP PIIQRLKXS AFSLHSYSPG 2900 

EINF\/AACLR KD3VPPLRAW PHBARSVRAR LLSP^3C3?AAI CX2CYLFNWAV 2950 

RTKLKLTPIA AAGRLDLSGW FEAGYSOGDI YHSv^SHARPR WFV3FCLLLLA 3000 

AGVGIYIXHSr R "^^"^ 

FIG. 16H
#1a. 3' Deletion mutants of pCV-H77C

Sequence of 3' untranslated region of pCV-H77C 
(Schematic: 5' UTR; ORF; 3' UTR, comprising the 3' variable, poly U-UC and 3' conserved regions.)

(3' variable region; 43 nts)

TGA (Stop codon for polyprotein)

AGGTTGGGGT AAACACTCCG GCCT CTTAAG CCATTTCCTG 

(poly U-UC region; 81 nts)

TTTTTTTTTT TTTTTTTTTT TTTTTTTTCT TTTTTTTTTT CTTTCCTTTC
CTTCTTTTTT TCCTTTCTTT TTCCCTTCTT T

(3' conserved region; 101 nts)

AATGGTGGCT CCATCTTAGC CCTAGTCACG GCTAGCTGTG AAAGGTCCGT 
GAGCCGCATG ACTGCAGAGA GTGCTGATAC TGGCCTCTCT GCAGATCATG 
T 

#1a -1. pCV-H77C(-98X) ; 3' 98 nucleotides removed from pCV-H77C 

TGA AGGTTGG GGTAAACACT CCGGCCT CTT AAG CCATTTC CTGTTTTTTT 
TTTTTTTTTT TTTTTTTTTT TCTTTTTTTT TTTCTTTCCT TTCCTTCTTT 
TTTTCCTTTC TTTTTCCCTT CTTTAAT 

#1a -2. pCV-H77C(-42X) ; 3' 42 nucleotides removed from pCV-H77C 

TGA AGGTTGG GGTAAACACT CCGGCCT CTT AAG CCATTTC CTGTTTTTTT 
TTTTTTTTTT TTTTTTTTTT TCTTTTTTTT TTTCTTTCCT TTCCTTCTTT
TTTTCCTTTC TTTTTCCCTT CTTTAATGGT GGCTCCATCT TAGCCCTAGT
CACGGCTAGC TGTGAAAGGT CCGTGAGCCG CAT 

#1a -3. pCV-H77C(X-52) ; All of the 3' UTR sequence, except 3' 49 nucleotides,
removed from pCV-H77C

TGAGCCGCAT GACTGCAGAG AGTGCTGATA CTGGCCTCTC TGCAGATCAT 

FIG. 17A
#1a -4. pCV-H77C(X) ; All of the 3' UTR sequence, except 3' 101 nucleotides,
removed from pCV-H77C

TGAAATGGTG GCTCCATCTT AGCCCTAGTC ACGGCTAGCT GTGAAAGGTC
CGTGAGCCGC ATGACTGCAG AGAGTGCTGA TACTGGCCTC TCTGCAGATC 
ATGT 

#1a -5. pCV-H77C(+49X) ; The proximal 49 nucleotides of the 3' conserved
region ( 98 nucleotides ; AAT not included) removed from pCV-H77C

TGAAGGTTGG GGTAAACACT CCGGCCTCTT AAGCCATTTC CTGTTTTTTT
TTTTTTTTTT TTTTTTTTTT TCTTTTTTTT TTTCTTTCCT TTCCTTCTTT
TTTTCCTTTC TTTTTCCCTT CTTTAATGCC GCATGACTGC AGAGAGTGCT 
GATACTGGCC TCTCTGCAGA TCATGT 

#1a -6. pCV-H77C(VR-24) ; First 24 nucleotides of the 3' variable region
removed from pCV-H77C

TGACTTAAGC CATTTCCTGT TTTTTTTTTT TTTTTTTTTT TTTTTTTCTT
TTTTTTTTTC TTTCCTTTCC TTCTTTTTTT CCTTTCTTTT TCCCTTCTTT
AATGGTGGCT CCATCTTAGC CCTAGTCACG GCTAGCTGTG AAAGGTCCGT
GAGCCGCATG ACTGCAGAGA GTGCTGATAC TGGCCTCTCT GCAGATCATG
T



#1a -7. pCV-H77C(-U/UC) ; Poly U-UC region removed from pCV-H77C 

TGAAGGTTGG GGTAAACACT CCGGCCTCTT AAGCCATTTC CTGAATGGTG
GCTCCATCTT AGCCCTAGTC ACGGCTAGCT GTGAAAGGTC CGTGAGCCGC 
ATGACTGCAG AGAGTGCTGA TACTGGCCTC TCTGCAGATC ATGT 

FIG. 17B
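The seven deletion mutants listed under #1a can each be derived from the wild-type 3' terminal sequence by simple slicing. The following is a minimal sketch, not part of the original figures. It assumes the region boundaries stated in FIG. 17A (stop codon plus 40-nt variable region, 81-nt poly U-UC region, 101-nt conserved region) and assumes that unreadable positions in the poly U-UC listing are T, as in the intact mutant sequences. The names `seq`, `WILD_TYPE` and `MUTANTS` are illustrative only.

```python
def seq(text: str) -> str:
    """Join a sequence written in spaced 10-nt groups into one string."""
    return "".join(text.split())


# Wild-type 3' terminal regions of pCV-H77C as listed in FIG. 17A.
STOP = "TGA"  # stop codon for the polyprotein
VARIABLE = seq("AGGTTGGGGT AAACACTCCG GCCTCTTAAG CCATTTCCTG")  # 40 nts
# Assumption: garbled positions in the source listing are read as T.
POLY_U_UC = seq("""
    TTTTTTTTTT TTTTTTTTTT TTTTTTTTCT TTTTTTTTTT CTTTCCTTTC
    CTTCTTTTTT TCCTTTCTTT TTCCCTTCTT T
""")  # 81 nts
CONSERVED = seq("""
    AATGGTGGCT CCATCTTAGC CCTAGTCACG GCTAGCTGTG AAAGGTCCGT
    GAGCCGCATG ACTGCAGAGA GTGCTGATAC TGGCCTCTCT GCAGATCATG
    T
""")  # 101 nts

WILD_TYPE = STOP + VARIABLE + POLY_U_UC + CONSERVED

# Each entry follows the verbal description given for that mutant in #1a.
MUTANTS = {
    "-98X": WILD_TYPE[:-98],            # 3' 98 nts removed
    "-42X": WILD_TYPE[:-42],            # 3' 42 nts removed
    "X-52": STOP + CONSERVED[-49:],     # only the 3' 49 nts kept
    "X": STOP + CONSERVED,              # only the 3' 101 nts kept
    # AAT kept, next 49 nts of the conserved region removed
    "+49X": STOP + VARIABLE + POLY_U_UC + CONSERVED[:3] + CONSERVED[3 + 49:],
    "VR-24": STOP + VARIABLE[24:] + POLY_U_UC + CONSERVED,  # first 24 nts of variable region removed
    "-U/UC": STOP + VARIABLE + CONSERVED,                   # poly U-UC region removed
}

if __name__ == "__main__":
    print(f"wild type: {len(WILD_TYPE)} nts")
    for name, mutant in MUTANTS.items():
        print(f"pCV-H77C({name}): {len(mutant)} nts, 3' end ...{mutant[-10:]}")
```

Comparing the derived strings with the listings above is a quick way to spot transcription errors in either.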



WO 99/04008 PCT/US98/14688



#1b. Strategy of 3' Deletion mutants 

#1b-1. pCV-H77C(-98X)

(Schematic: 3' variable region; poly U-UC region; 3' conserved region (98 nts); Xba I.)

1. PCR Amplification 

2. Purification of PCR products 

3. Digestion with Afl II and Xba I

4. Cloning of Afl II /Xba I fragment into pCV-H77C 

5. Complete sequence analysis 

6. in vitro transcription (within 24 hours of inoculation) 

7. Percutaneous intrahepatic transfection into chimpanzee ; 11/26/97 and 12/17/97 

8. Result : Negative ( No replication) 



#1b-2. pCV-H77C(-42X) 

(Schematic: 3' variable region; poly U-UC region; 3' conserved region (42 nts); Afl II (9403); Nhe I (9530); synthesized oligonucleotides.)



1. Synthesis of oligonucleotides ( sense and anti-sense ) 

2. Hybridization of oligonucleotides 

3. Digestion with Nhe I and Xba I 

4. Cloning of Nhe I / Xba I fragment into pG9-KL26 (3' UTR of H77C)

5. Sequence analysis

6. Cloning of 3' UTR (-42X) [Afl II / Xba I fragment] into pCV-H77C

7. Complete sequence analysis 

8. in vitro transcription (within 24 hours of inoculation) 

9. Percutaneous intrahepatic transfection into chimpanzee (Schedule: 1/22/98, 2/5/98)
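The restriction-site coordinates shown in the schematics, Afl II (9403) and Nhe I (9530), can be cross-checked against the 3' terminal sequence. The sketch below is illustrative and rests on three assumptions that are not stated in the figures: that the genome is 9599 nts long (the last coordinate in the FIG. 16E listing), that the 225-nt stretch from the stop codon to the 3' end therefore begins at nucleotide 9375, and that each figure label gives the first nucleotide 3' of the cut (Afl II cuts C^TTAAG, Nhe I cuts G^CTAGC). The names `UTR`, `SITES` and `site_coordinate` are hypothetical.

```python
def seq(text: str) -> str:
    """Join a sequence written in spaced groups into one string."""
    return "".join(text.split())


# Stop codon plus 3' UTR of pCV-H77C, as listed in FIG. 17A (225 nts).
UTR = seq("""
    TGA
    AGGTTGGGGT AAACACTCCG GCCTCTTAAG CCATTTCCTG
    TTTTTTTTTT TTTTTTTTTT TTTTTTTTCT TTTTTTTTTT CTTTCCTTTC
    CTTCTTTTTT TCCTTTCTTT TTCCCTTCTT T
    AATGGTGGCT CCATCTTAGC CCTAGTCACG GCTAGCTGTG AAAGGTCCGT
    GAGCCGCATG ACTGCAGAGA GTGCTGATAC TGGCCTCTCT GCAGATCATG
    T
""")

GENOME_LENGTH = 9599  # assumption: last coordinate printed in FIG. 16E

# Recognition sequence and number of bases 5' of the cut on the top strand.
SITES = {
    "Afl II": ("CTTAAG", 1),  # C^TTAAG
    "Nhe I": ("GCTAGC", 1),   # G^CTAGC
}


def site_coordinate(enzyme: str) -> int:
    """Return the genome coordinate of the first base 3' of the cut."""
    motif, cut_offset = SITES[enzyme]
    index = UTR.find(motif)  # 0-based position within the 3' terminal stretch
    if index < 0:
        raise ValueError(f"{enzyme} site not found")
    first_base = GENOME_LENGTH - len(UTR) + 1  # coordinate of the T of TGA
    return first_base + index + cut_offset


if __name__ == "__main__":
    for enzyme in SITES:
        print(enzyme, site_coordinate(enzyme))
```

Under these assumptions both computed coordinates agree with the figure labels, which supports the region boundaries used in FIG. 17A. The Nde I (9160) site lies upstream in NS5B and cannot be checked from the 3' terminal sequence alone.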



FIG. 17C
#1b-3. pCV-H77C(X-52)

(Schematic: NS5B; TGA; 3' variable region; poly U-UC region; 3' conserved region (52 nts | 49 nts); Nde I (9160); fragment a (Pfu PCR); fragment b (synthesized oligonucleotides); Xba I; fusion and extension.)



1. Fragment a ; Pfu PCR amplification and purification 

2. Fragment b ; Synthesized oligonucleotides (anti-sense) 

3. Fusion and extension 

4. TA cloning 

5. Sequence analysis 

6. Cloning Nde I-Xba I fragment with correct sequence into pCV-H77C

7. Complete sequence analysis

8. In vitro transcription (within 24 hours of inoculation)

9. Percutaneous intrahepatic transfection into chimpanzee 



FIG. 17D
#1b-4. pCV-H77C(X)

(Schematic: NS5B; 3' variable region; poly U-UC region; 3' conserved region (101 nts); Nde I (9160); fragment a (Pfu PCR); fragment c (synthesized oligonucleotides); Xba I; fusion and extension.)



1. Fragment a ; Pfu PCR amplification and purification 

2. Fragment c ; Synthesized oligonucleotides (anti-sense)

3. Fusion and extension 

4. TA cloning 

5. Sequence analysis 

6. Cloning Nde I-Xba I fragment with correct sequence into pCV-H77C

7. Complete sequence analysis

8. In vitro transcription (within 24 hours of inoculation)

9. Percutaneous intrahepatic transfection into chimpanzee



FIG. 17E
#1b-5. pCV-H77C(+49X)

(Schematic: 3' variable region; poly U-UC region; 3' conserved region (49 nts); Afl II (9403); synthesized oligonucleotides; Xba I; fusion and extension.)



1. Fragment d ; Pfu PCR amplification and purification

2. Fragment e ; Synthesized oligonucleotides (anti-sense) 

3. Fusion and extension 

4. TA cloning 

5. Sequence analysis 

6. Cloning Afl II-Xba I fragment with correct sequence into pCV-H77C

7. Complete sequence analysis

8. In vitro transcription (within 24 hours of inoculation)

9. Percutaneous intrahepatic transfection into chimpanzee



FIG. 17F
#1b -6. pCV-H77C(VR-24)

(Schematic: NS5B; TGA; 3' variable region (24 nts); poly U-UC region; 3' conserved region; Nde I (9160); Afl II (9403).)



1. PCR Amplification 

2. Purification of PCR products 

3. Digestion with Nde I and Afl II

4. Cloning of Nde I / Afl II fragment into pCV-H77C

5. Complete sequence analysis

6. in vitro transcription (within 24 hours of inoculation)

7. Percutaneous intrahepatic transfection into chimpanzee



#1b -7. pCV-H77C(-U/UC) 

3' variable Poly U-UC 3' conserved